Short title: Maize RNA-Seq GCNs
Corresponding author details: Dr. Karen McGinnis
Article title: Construction and Optimization of Large Gene Co-expression Network in Maize Using RNA-Seq Data
Author names and affiliation:
Ji Huang, Stefania Vendramin, Karen M. McGinnis. Department of Biological Science, Florida State University, Tallahassee, FL 32306
Lizhen Shi. Department of Computer Science, Florida State University, Tallahassee, FL 32306
One-sentence summary: Large-scale maize co-expression network from RNA-Seq data facilitates gene function and pathway analysis.
Footnotes
List of author contributions: J.H. and K.M. designed the experiments; J.H. conducted experiments; J.H. and S.V. analyzed the data; J.H., K.M., and S.V. interpreted the data; L.S. and J.H. made the website; J.H., K.M., and S.V. wrote the article.
Funding information: National Science Foundation
Corresponding author email: mcginnis@bio.fsu.edu

Abstract
With the emergence of massively parallel sequencing, genome-wide expression data production has reached an unprecedented level. This abundance of data has greatly facilitated maize research, but may not be amenable to traditional analysis techniques that were optimized for other data types. Using publicly available data, a Gene Co-expression Network (GCN) can be constructed and used for gene function prediction, candidate gene selection, and improving understanding of regulatory pathways. Several GCN studies have been done in maize, mostly using microarray datasets. To build an optimal GCN from RNA-Seq data from plant materials, parameters for expression data normalization and network inference were evaluated. A comprehensive evaluation of the effect of these two parameters and a ranked aggregation strategy on network performance was conducted using libraries from 1,266 maize samples. Three normalization methods (VST, CPM, RPKM) and ten inference methods, including six correlation and four mutual information (MI) methods, were tested. The three normalization methods had very similar performance. For network inference, correlation methods performed better than MI methods at some genes. Increasing sample size also had a positive effect on GCN performance. Aggregating single networks together resulted in improved performance compared to single networks.

Plant Physiology Preview. Published on August 2, 2017, as DOI:10.1104/pp.17.00825
Copyright 2017 by the American Society of Plant Biologists
Downloaded on August 22, 2020. Published by www.plantphysiol.org. Copyright © 2017 American Society of Plant Biologists. All rights reserved.

Introduction
Zea mays (maize) is the most widely produced crop in the United States, and US agriculture accounted for 36% of world maize production in 2015 (USDA, 2016). Maize has also been at the center of genetics research for over 100 years, including McClintock's pioneering work with transposable elements (TEs) (reviewed by McClintock, 1983; Fedoroff, 2012). Due to recent technological advances in nucleic acid sequencing and the availability of the maize genome sequence (Schnable et al., 2009), maize genomics research has been greatly expedited.
RNA-Sequencing (RNA-Seq) has become the favored technique for detecting genome-wide expression patterns. RNA-Seq has some advantages over microarray analysis of gene expression, including single base pair resolution, detection of novel transcripts, and the ability to analyze transcript abundance without existing genome information (reviewed by Wang et al., 2009; Han et al., 2015; Conesa et al., 2016). RNA-Seq data also provide information about single nucleotide polymorphisms (SNPs), which facilitates Genome-wide Association Studies (GWAS) (Fu et al., 2013; Li et al., 2013a; Lonsdale et al., 2013; Fadista et al., 2014). Because of its widespread adaptability, over five thousand Illumina platform maize RNA-Seq libraries (Fig. 1A) are available in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (Leinonen et al., 2010), adding to the body of data that can be used to study the maize genome.
The maize genome is large and heterogeneous, and the genome annotation is still far from complete (Mark Cigan et al., 2005; Ficklin and Feltus, 2011). Although recent work has made substantial progress toward describing genome-wide expression patterns in many genotypes, environmental conditions, and tissues, relatively little is known about the function and regulation of most maize genes. Because genes with related biological functions or regulatory mechanisms often have similar expression patterns (Aoki et al., 2007), one way to enhance understanding of gene function is by construction of a Gene Co-expression Network (GCN) (D'haeseleer et al., 2000; Aoki et al., 2007; Usadel et al., 2009; Li et al., 2015c; Serin et al., 2016). GCNs are constructed using data mining tools and algorithms that describe the relatedness between the expression patterns of multiple genes in a pairwise fashion.
The use of GCNs pre-dates the availability of RNA-Seq expression data (Ficklin and Feltus, 2011; Sato et al., 2011; De Bodt et al., 2012), meaning that these approaches were initiated and optimized predominantly with microarray datasets. Maize RNA-Seq samples are already five times more abundant than microarray samples (Fig. 1) and are increasing in number, meaning that an RNA-Seq oriented maize GCN protocol would be valuable to the scientific community. Although the initial inputs and results from microarray and RNA-Seq are similar, there are many differences between the data types and analytical approaches. It is therefore anticipated that some adjustments to GCN parameters may improve the efficacy of GCN analysis of RNA-Seq data. GCN construction is typically a multistep process: normalization of input datasets, network inference, network evaluation, and interpretation (Supplemental Fig. 1).
Both RNA-Seq and microarrays are affected by systematic variations (Park et al., 2003; Oshlack and Wakefield, 2009; Zheng et al., 2011; Li et al., 2014b). Therefore, genome-wide expression results generated by either technique need to be normalized prior to analysis (Dillies et al., 2013a; Li et al., 2015b). Variance stabilizing transformation (VST), Counts Per Million (CPM), and Reads Per Kilobase per Million (RPKM) are three popular normalization methods for RNA-Seq experiments (Mortazavi et al., 2008; Anders and Huber, 2010; Rau et al., 2013).
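
As an illustration of two of these normalizations, the following is a minimal sketch of CPM and RPKM for a single library. The function names and the toy counts are hypothetical and are not taken from the pipeline described here; VST is omitted because it requires fitting a dispersion model across libraries.

```python
def cpm(counts, library_size):
    """Counts per million: scale raw counts by the library's total mapped reads."""
    return [c * 1e6 / library_size for c in counts]


def rpkm(counts, gene_lengths_bp, library_size):
    """Reads per kilobase per million: CPM further divided by gene length in kb."""
    return [c * 1e9 / (library_size * length)
            for c, length in zip(counts, gene_lengths_bp)]


# Toy library (hypothetical numbers): three genes, 10,000 mapped reads in total.
counts = [500, 1500, 8000]
gene_lengths_bp = [1000, 3000, 4000]
library_size = sum(counts)

cpm_values = cpm(counts, library_size)
rpkm_values = rpkm(counts, gene_lengths_bp, library_size)
print(cpm_values)
print(rpkm_values)
```

In this toy case the first two genes differ threefold in CPM but have the same RPKM, because the second gene is three times longer. This is the gene-length adjustment that distinguishes RPKM from CPM.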
Some work has been done to evaluate the efficacy of different normalization methods for expression analysis. Giorgi et al. (2013) showed that VST normalization of RNA-Seq data resulted in a GCN with similar characteristics to a microarray-supported network in terms of coefficient and node degree distribution. Normalization with CPM, using the Trimmed Mean of M-values (TMM) to adjust for composition bias between RNA-Seq datasets by calculating normalization factors (Robinson et al., 2010), increased the robustness of analysis among diverse library sizes and compositions (Dillies et al., 2013a). These studies suggest that optimizing normalization methods might improve GCN performance.
There are several methods for gene network inference, including correlation, mutual information (MI), Bayesian network, and probabilistic graphical models. Typically, correlation and MI methods are used for constructing large-scale GCNs with more than ten thousand genes (Krouk et al., 2013). Correlation methods include Pearson correlation coefficient (PCC), Spearman's correlation coefficient (SCC), Kendall rank correlation coefficient (KCC), Gini correlation coefficient (GCC), and biweight midcorrelation (BIC) (Langfelder and Horvath, 2008; Kumari et al., 2012; Ma and Wang, 2012; Ballouz et al., 2015). Cosine similarity coefficient (CSC) has also been used for computing similarities in sparse datasets such as text (Dhillon and Modha, 2001) and protein-protein interaction data (Luo et al., 2015). MI methods include the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR) (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007). The network inference method might also influence GCN performance.
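
MI-based methods such as ARACNE, MRNET, and CLR start from pairwise mutual information estimates. The sketch below shows one simple way such an estimate could be computed, using equal-width binning. The binning scheme, the number of bins, and the function names are assumptions made for illustration; they are not the estimator used in this study.

```python
import math
from collections import Counter


def discretize(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant vector
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=3):
    """Plug-in MI estimate (in bits) between two expression vectors."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())


# Toy non-linear relationship (hypothetical data): y depends on x, but not linearly.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [v * v for v in x]

mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
mi_bits = mutual_information(x, y)
print(covariance_sum, mi_bits)
```

Here the covariance term is zero, so a Pearson correlation would report no relationship, while the MI estimate is clearly positive. Binned estimates of this kind are sensitive to the number of bins and to the number of samples.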
Several resources are already available for GCN analysis in maize, including COB (Schaefer et al., 2014), CORNET (De Bodt et al., 2012), CoP (Ogata et al., 2010), PLANEX (Yim et al., 2013), and ATTED-II (Obayashi et al., 2009). All of these databases except ATTED-II used PCC to build GCNs from 128 to 379 microarray datasets. ATTED-II recently updated its database to provide GCNs from both microarray and RNA-Seq data using PCC-based mutual rank (Aoki et al., 2015). Although PCC is widely used, there is very limited evidence that it is the optimal approach for GCN analyses.
GCNs could also be improved by meta-analysis using ranked aggregation from individual networks (Zhong et al., 2014; Ballouz et al., 2015; Wang et al., 2015a). By aggregating individual experiments, only interactions consistent among networks are preserved, which helps reduce noise and highlights conserved interactions. Furthermore, the ranked aggregation method provides a way to efficiently increase the size of the aggregated network with newly available datasets, and recalculation with all datasets is not required when a new one is added. This provides an efficient way to process and incorporate emerging information.
Herein, an extensive evaluation of maize GCN construction is reported. Three parameters were tested: normalization method, network inference algorithm, and ranked aggregation method. To our knowledge, this is the first comprehensive attempt to optimize GCN construction using plant RNA-Seq datasets. The network is publicly accessible at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php. A tutorial is also provided as supplemental material.

Results

Manually Curated Maize mRNA Expression Profiling from Publicly Available Datasets

Recently, the usage of RNA-Seq in maize has increased dramatically, from zero data entries in NCBI-SRA in 2008 to over nine hundred in 2016 (Fig. 1B). In contrast, the most widely used Affymetrix expression array for maize had 177 samples in 2008 but only 46 in 2016 (Fig. 1B). GCN construction approaches have not been optimized for RNA-Seq datasets in plants, and doing so could improve the quality and robustness of GCNs. To support a comprehensive evaluation of the effect of RNA-Seq normalization methods and network inference methods on the performance of GCNs, maize RNA-Seq datasets were compiled and processed with a computational pipeline (Supplemental Fig. 1). In total, 1,266 high-quality RNA-Seq maize libraries from 17 different experiments were selected as input to an expression matrix. The corresponding experimental descriptions and publications, where available, of each library were manually checked for sample information (Supplemental Table S1). In addition, filters for read depth and alignment rate were used to remove unqualified libraries (see Methods for details). Tissue type and haplotype from those libraries were manually curated and found to include a range of sample types (Supplemental Table S1). Shoot apical meristem (SAM), leaf, and root were the three most abundant tissue types, but a wide range of tissues was represented by multiple libraries in the dataset (Supplemental Fig. 1). The dataset also included multiple haplotypes, although B73 represented approximately 40% of the included libraries. To reduce noise, lowly expressed genes were removed from analysis, leaving 15,116 nonredundant genes across the 1,266 libraries. For comparative purposes, the Affymetrix GeneChip maize array includes 13,339 genes before filtering (GeneChip Maize Genome Array, http://www.affymetrix.com/catalog/131468/AFFY/Maize+Genome+Array#1_1).

Three RNA-Seq Normalization Methods Show Comparable Distribution of Expression

Expression data from distinct sources and experiments can be highly variable because of hybridization artifacts in microarrays or variable sequencing depth in RNA-Seq. Many methods have been successfully used for normalizing both microarray and RNA-Seq data to correct for potential biases (Lim et al., 2007; Dillies et al., 2013b; Li et al., 2015b). To find an optimal normalization method for building a maize GCN from RNA-Seq data, three widely used normalization methods were compared: Variance Stabilizing Transformation (VST), Counts Per Million (CPM), and Reads Per Kilobase per Million (RPKM) (Mortazavi et al., 2008; Anders and Huber, 2010; Rau et al., 2013). For all normalization methods, log2 transformation of the normalized expression values reduced the skew of the data distribution (Supplemental Fig. 2). Several network studies from plant RNA-Seq data used log2 transformation (Davidson et al., 2011; Ma and Wang, 2012; Giorgi et al., 2013; Stelpflug et al., 2015; Walley et al., 2016). In our analysis, genes with CPM > 2 in more than 1,000 samples were included. This filter dramatically reduced zero count values in the raw data from 30.949% to 0.367%. Moreover, a prior count of one was added at log2 normalization (expression = log2(CPM/RPKM + 1)) to avoid problems with remaining zero values. The log2 transformation reduced skewed distributions and extreme values represented by outliers (Supplemental Fig. 2). We therefore consider log2 transformation important for our data.
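
The expression filter and log2 transformation described above can be sketched as follows. This is a minimal illustration that assumes a CPM matrix stored as a dictionary from gene ID to a list of per-library values; the function name, data layout, and toy values are hypothetical.

```python
import math


def filter_and_log(cpm_matrix, min_cpm=2.0, min_samples=1000):
    """Keep genes with CPM > min_cpm in more than min_samples libraries,
    then apply log2(value + 1) so that remaining zeros stay defined."""
    kept = {}
    for gene, values in cpm_matrix.items():
        n_expressed = sum(1 for v in values if v > min_cpm)
        if n_expressed > min_samples:
            kept[gene] = [math.log2(v + 1.0) for v in values]
    return kept


# Toy matrix with four libraries (hypothetical values); the sample threshold
# is scaled down from 1,000 to 2 to fit the toy size.
toy_cpm = {
    "geneA": [0.0, 3.0, 7.0, 15.0],
    "geneB": [0.0, 0.0, 1.0, 3.0],
}
filtered = filter_and_log(toy_cpm, min_cpm=2.0, min_samples=2)
print(filtered)
```

In the toy example geneA passes (three libraries above the CPM cut-off) and geneB is dropped (only one library above it). The zero in geneA maps to 0 after adding the prior count of one.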
The distribution of gene expression across the 1,266 libraries formed a bell-shaped curve with a small additional peak of low expression for all three methods (Supplemental Fig. 2). To determine whether these low expression values came from a few or multiple libraries, elements within the range of expression that corresponded to the observed peak (< -37 CPM; Supplemental Fig. 2B) were extracted from the CPM-normalized expression matrix and matched to the originating libraries. This demonstrated that the low expression elements were not limited exclusively to specific libraries, but eight libraries contributed over 25% of low elements. A gene ontology enrichment analysis failed to identify significant gene ontology descriptors within the subset of 43 genes that were defined as lowly expressed (data not shown). All eight of these libraries were from pollen tissue, where the average gene expression, at 147 Counts Per Million (CPM), is lower than the average gene expression of the other 79 tissues combined, at 183 CPM. Hierarchical clustering and a correlation heatmap with the same data (Stelpflug et al., 2015) show the uniqueness of the pollen tissue expression pattern (Langfelder and Horvath, 2008) (Supplemental Fig. 3). When the lowly expressed elements from RPKM- and VST-normalized data were analyzed to determine library origin and GO enrichment (data not shown), we found a similarly high level of pollen-specific libraries without significant GO categories. In pollen, some highly expressed genes are considered orphan genes (Wu et al., 2014) because they lack detectable homologs in other species. To investigate whether these lowly expressed genes were orphan genes, their gene sequences were searched against the Setaria italica genome (JGIv2) using BLASTX (e-value < 1E-03). Setaria italica (foxtail millet) is a close relative of maize, which diverged 234 million years ago (MYA) as estimated by TimeTree (Kumar et al., 2017). Only 1 out of 43 genes lacked detectable homologs in Setaria italica (data not shown), indicating that the majority of these genes are not likely to be orphan genes.
Because RPKM normalization accounts for gene length, the distribution of gene length versus expression for the RPKM method was compared to data normalized by the VST and CPM methods. VST- and CPM-normalized data showed very similar overall patterns, with no clear linear relationship between gene length and average expression (Supplemental Fig. 2C). RPKM-normalized data displayed an apparent bias toward elevated expression of a small number of genes less than 5,000 bp in length and lower expression of long genes, suggesting that this normalization method might skew the distribution of expression at some genes. Overall, in spite of these differences, the three normalization methods resulted in a similar distribution of expression patterns for most of the genes included in the analysis. Additional analysis was completed to determine if the three normalization methods influence network performance.

Network Performance Does Not Differ Based Upon Normalization Method

To compare the efficacy of three normalization and ten inference methods, a GCN was generated for each combination of normalization and inference methods. Furthermore, all networks were rank-standardized so that edge weights ranged from 0 to 1 (see Methods). All network evaluations used the whole adjacency matrix (15,116 × 15,116 in RNA-Seq networks; 11,429 × 11,429 or 17,862 × 17,862 in protein networks) without a cut-off. The performance of the different networks was measured by comparing the area under the receiver operating characteristic curve (AUROC). AUROC is a measurement used to evaluate the accuracy of classification models, making it suitable for evaluating GCNs (Gillis and Pavlidis, 2011; Ma and Wang, 2012; Liu et al., 2017). AUROC values range from 0 to 1, with a value closer to 1 indicating that the network is discriminating nonrandom patterns and perfect classification, random networks returning values close to 0.5, and values closer to 0 indicating a high degree of incorrect classification. While an AUROC value close to 1 is optimal, values over 0.7 suggest good performance when analyzing large, diverse networks (Gillis and Pavlidis, 2011). To set up the AUROC baseline for the random networks, maize gene IDs were shuffled 10 times (for MRNET and CLR) or 1,000 times (for PCC) in the normalized expression matrix. The randomized expression matrices were used for inference with the designated algorithms and further evaluated. The resulting AUROC values from randomized networks were very close to 0.5 (Supplemental Table S2).
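
To make the two ideas in this paragraph concrete, the sketch below rank-standardizes a vector of edge weights to the range 0 to 1 and computes AUROC from ranks (the Mann-Whitney formulation). It is an illustrative sketch only: the exact rank-standardization used in this study is described in the Methods, and the (rank - 1)/(n - 1) scaling, the function names, and the toy scores are assumptions.

```python
def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_standardize(values):
    """Map edge weights to [0, 1] by rank (assumed scaling; needs >= 2 values)."""
    n = len(values)
    return [(r - 1.0) / (n - 1.0) for r in average_ranks(values)]


def auroc(scores, labels):
    """AUROC from ranks: probability that a positive outranks a negative."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Toy edge weights from one gene to four others (hypothetical values);
# label 1 marks a known partner, 0 a gene with no known relationship.
scores = [0.9, 0.8, 0.3, 0.1]
print(rank_standardize(scores))
print(auroc(scores, [1, 1, 0, 0]))  # both known partners ranked on top
print(auroc(scores, [1, 0, 1, 0]))  # one known partner ranked below a negative
```

Because AUROC depends only on ranks, rank standardization leaves the AUROC of a network unchanged while putting networks from different inference methods on a common 0-to-1 scale.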
AUROC values were calculated and compared for three different network characteristics. The first characteristic was designed to test if the network identified genes with known or predicted co-expression patterns, based upon prior results and inclusion in two existing datasets that could serve as a positive control for co-expression. The maize metabolic pathway database (MaizeCyc) contains 413 pathways with more than two genes and was built based upon collection of evidence from genome annotation, phylogenetic distance, and known genes in maize, rice, and Arabidopsis (Monaco et al., 2013). The maize protein-protein interaction database (PPIM) is based upon both predicted and experimentally detected protein interactions (Zhu et al., 2016) and was the second dataset used in this analysis. Only high-confidence interactions from PPIM were used, as defined by ranking in the top 5% in their model (Zhu et al., 2016). For comparison with the GCN, genes within the same MaizeCyc or PPIM pathways were considered co-expressed. The MaizeCyc and PPIM datasets were combined, and genes with fewer than 5 interactions were excluded from evaluation, creating a compiled dataset referred to herein as the Protein-Protein and Pathway dataset (PPPTY). PPPTY had 1,720 genes and 104,856 interactions that were used in this evaluation. The AUROC value was calculated for each of the 1,720 gene terms.
To assess the effect of normalization method on GCNs, AUROC values for all ten inference methods were averaged for each of the three normalization methods. All three normalization methods scored similarly in comparison with the PPPTY dataset (Fig. 2B), with a mean AUROC value around 0.575 for each, suggesting that the predicted networks were more selective than a random network.
The second characteristic was the presence of similar gene ontology (GO) information for maize genes within a detected co-expression set, based upon "guilt by association", which assumes that specific subgroups of co-expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from AgriGO (Du et al., 2010), which uses signature integration by InterPro to map gene IDs to GO terms rather than co-expression data. InterPro provided over 108 million stable GO terms to the functional protein information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations provide a reliable evaluation resource independent of co-expression data. To assess this characteristic, gene ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for sets of co-expression matrices and compared. Co-expression matrices were assessed by 3-fold cross-validation, which involved masking GO terms from some genes to test whether the masked GO terms could be predicted based upon gene expression patterns. In total, 277 GO terms were included in this analysis.
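
The neighbor voting evaluation with 3-fold cross-validation can be sketched as below. This is a simplified illustration of the general idea rather than the implementation used here: each gene is scored by the fraction of its total edge weight that connects to genes still annotated with the GO term, and the hidden annotated genes are then ranked against unannotated genes. The fold assignment, function names, and toy network are assumptions.

```python
def pairwise_auroc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def neighbor_voting_auroc(weights, labels, n_folds=3):
    """Mean cross-validated AUROC for one GO term on a weighted network."""
    n = len(labels)
    positives = [i for i in range(n) if labels[i]]
    negatives = [i for i in range(n) if not labels[i]]
    fold_scores = []
    for fold in range(n_folds):
        hidden = positives[fold::n_folds]  # annotations masked in this fold
        if not hidden:
            continue
        visible = [i for i in positives if i not in set(hidden)]

        def vote(i):
            total = sum(weights[i][j] for j in range(n) if j != i)
            support = sum(weights[i][j] for j in visible if j != i)
            return support / total if total > 0 else 0.0

        fold_scores.append(pairwise_auroc([vote(i) for i in hidden],
                                          [vote(i) for i in negatives]))
    return sum(fold_scores) / len(fold_scores)


# Toy network (hypothetical): genes 0-2 share a GO term and are tightly
# co-expressed; genes 3-5 form a second module.
n_genes = 6
weights = [[0.9 if (i < 3) == (j < 3) else 0.1 for j in range(n_genes)]
           for i in range(n_genes)]
labels = [1, 1, 1, 0, 0, 0]
go_auroc = neighbor_voting_auroc(weights, labels)
print(go_auroc)
```

In this toy network every masked gene is recovered ahead of all unannotated genes, so the cross-validated AUROC is 1. In a real network the value would be computed once per GO term and then averaged.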
When GO characteristics were used to assess the networks, all three normalization methods performed similarly, but the AUROC values, at around 0.689 for each, were higher than those observed for comparisons with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions. The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets were 0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant difference in the AUROC scores associated with the GCNs for the characteristics that were tested.
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a subset of co-expressed genes. The interactions between such a protein and regulated DNA could be detected by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, there are five ChIP-Seq datasets available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et al., 2015; Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example, Opaque2 is specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral tissues (Bolduc et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed (Morohashi et al., 2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for comparison, with 26 confirmed binding targets that are relatively highly expressed in most maize tissues (Yang et al., 2016). Histone deacetylation often correlates with decreased gene expression (Verdin and Ott, 2014). High-confidence HDA101 targets were defined as those discovered by ChIP-Seq that also showed increased gene expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101 targets were compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar among networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY evaluations.

Correlation Methods Perform Better than Mutual Information at Some Genes

After normalization, the expression matrices can be processed by different methods for GCN inference. To optimize this step, the AUROC values of six correlation methods (PCC, SCC, KCC, GCC, BIC, CSC) and four mutual information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices that were generated from each of three normalization methods (VST, CPM, RPKM) and then averaged. In general, correlation methods are more computationally efficient, while MI methods are able to reveal non-linear relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC, KCC, and BIC are less sensitive to outliers because SCC and KCC only consider the rank information and BIC calculates based on the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to be a better correlation method for gene expression analysis because of its capacity to detect non-linear relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR showed extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same testing parameters with AUROC calculations were used as described for the testing of normalization methods.
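
The contrast between PCC, SCC, and CSC described above can be illustrated with a small sketch. The implementations below follow the textbook definitions and are not the code used in this study; the toy expression vectors are hypothetical and were chosen so that one library acts as an outlier.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ascending ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation (SCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cosine(x, y):
    """Cosine similarity coefficient (CSC): no mean-centering is applied."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


# Two genes whose expression rises together; the last library is an outlier.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 100.0]
pcc, scc, csc = pearson(gene_a, gene_b), spearman(gene_a, gene_b), cosine(gene_a, gene_b)
print(pcc, scc, csc)
```

Because the relationship is monotonic, SCC stays at 1 while the single extreme value pulls PCC down. CSC uses the raw magnitudes without centering, which is one reason it behaves differently from the other correlation measures.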
Assessed by GO datasets, the 277 AUROC values were averaged to create one average value for each of the 10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization methods for the six correlation methods was 0.718, while the average AUROC for all four MI methods was 0.646. The majority of the 277 GO terms had similar AUROC values in the different correlation method-generated GCNs, and these patterns are different from those observed in the MI-generated GCNs (Fig. 3A). The similarity among different methods was also detectable by pairwise comparison and by comparing Pearson correlations between the different methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1,720 genes were averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed that the networks constructed using correlation methods resulted in higher AUROC values than MI methods, although the CSC method resulted in lower AUROC values than other correlation methods. As demonstrated for the GO evaluation, results from correlation methods were more similar to each other than to the MI methods (Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B), although this includes a small enough number of genes that the average AUROC value over the whole gene set was relatively low for those methods. The gene sets with the highest AUROC values in PCC, CSC, or MRNET were extracted. Characteristics of each gene set were compared in terms of average expression (CPM) and average number of lowly expressed elements (CPM < 0). The CSC gene set had the smallest number of low expression elements and had higher average expression than both the 1,720 gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from the 26 targets in the HDA101 ChIP-Seq dataset revealed that the CSC GCN had the highest AUROC value, and the use of MRNET/CLR GCNs resulted in slightly higher scores than correlation methods (Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes (Tzfadia et al., 2016). HDA101 is a highly expressed gene in all samples, with an average expression value of 864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is more suitable for the CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using the two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5% non-zero values for all comparisons, and so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize GCN from a single expression matrix, but co-expression with some individual genes may be better detected using MI methods. Normalization method did not have a substantial influence on GCN performance, so only CPM normalization was used in conjunction with PCC, SCC, MRNET, and CLR inference for subsequent optimization of other parameters.

Increased Sample Size Had a Positive Effect on GCN

GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were conducted with different numbers of samples and experiments to empirically determine the effect of sample number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between 12 and 404 libraries. For this analysis, the CPM normalization method followed by each of four inference methods (PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were evaluated by both GO and PPPTY.
In the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between natural-log-transformed sample size and average AUROC values (Fig. 4). The linear relationships were stronger for the PCC and SCC methods, with higher r-squared values, indicating that correlation methods benefit more from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC values than the MRNET and CLR methods for PPPTY and GO analysis for most of the individual networks (Fig. 5).

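
The relationship between sample size and AUROC described above amounts to an ordinary least-squares fit of average AUROC against the natural logarithm of the number of libraries. The sketch below shows that calculation. The AUROC values are made up and noise-free purely to demonstrate the arithmetic; only the range of library counts (12 to 404) comes from the text, and the function name is hypothetical.

```python
import math


def fit_auroc_vs_log_size(sample_sizes, aurocs):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size)."""
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(aurocs) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in aurocs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical experiments of increasing size with synthetic AUROC values.
sample_sizes = [12, 24, 100, 404]
aurocs = [0.50 + 0.03 * math.log(n) for n in sample_sizes]

slope, intercept, r_squared = fit_auroc_vs_log_size(sample_sizes, aurocs)
print(slope, intercept, r_squared)
```

With real per-experiment AUROC values, a higher r-squared for one inference method than another would indicate that its performance tracks sample size more closely.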
Ranked Aggregation of Networks Improved Performance of GCNs

Ranked aggregation for meta-analysis can also change the outcome of a GCN by buffering the effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated rank-standardized correlation/MI matrices were calculated from separate experiments to determine if this approach enhanced GCN performance. Aggregating individual networks together for meta-analysis can help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET, and CLR methods for GCN inference, as before, resulting in 68 single GCNs. The 17 experiments were aggregated for PCC, SCC, MRNET, and CLR individually and evaluated by the GO and PPPTY datasets.
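
One way ranked aggregation could be carried out is sketched below: each per-experiment network is rank-standardized, the standardized weights are summed edge by edge, and the sum is rank-standardized again. The exact procedure used in this study is given in the Methods; summing the standardized ranks, the dictionary layout, the function names, and the toy edge weights are all assumptions made for illustration.

```python
def rank_standardize(network):
    """Map edge weights to [0, 1] by ascending rank; ties share a mean rank.
    Assumes the network has at least two edges."""
    edges = sorted(network, key=network.get)
    n = len(edges)
    standardized = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and network[edges[j + 1]] == network[edges[i]]:
            j += 1
        for k in range(i, j + 1):
            standardized[edges[k]] = ((i + j) / 2.0) / (n - 1)
        i = j + 1
    return standardized


def aggregate_networks(networks):
    """Sum rank-standardized edge weights across networks, then re-standardize."""
    standardized = [rank_standardize(net) for net in networks]
    summed = {edge: sum(s[edge] for s in standardized) for edge in standardized[0]}
    return rank_standardize(summed)


# Three toy per-experiment networks over the same gene pairs (hypothetical).
experiment_1 = {("a", "b"): 0.90, ("a", "c"): 0.20, ("b", "c"): 0.50}
experiment_2 = {("a", "b"): 0.70, ("a", "c"): 0.10, ("b", "c"): 0.80}
experiment_3 = {("a", "b"): 0.95, ("a", "c"): 0.30, ("b", "c"): 0.20}

aggregated = aggregate_networks([experiment_1, experiment_2, experiment_3])
print(aggregated)
```

The a-b edge ranks highly in every experiment and ends up at the top of the aggregated network, whereas the b-c edge, which is strong in only one experiment, ends up in the middle. If the running sum of standardized weights is stored, a new experiment could be added without recomputing the earlier networks.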
Of the four aggregated networks that were evaluated, the two correlation methods (PCC and SCC) had higher
AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this
aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR
networks compared with the single networks from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation,
p-values 0.494 and 0.796). It has been reported that MI estimation accuracy depends on sample size (Gao
et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate
improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCNs performed best using a ranked
aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a
robust GCN.

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because the mRNA level is thought to be related to the level
and function of a protein of interest. However, many researchers have found inconsistencies between mRNA
and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al.,
2016). Although relatively little protein expression data is available, such data is amenable to GCN construction
and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression
atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks
were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO
datasets as previously mentioned.
GCNs constructed from protein expression did not exhibit superior AUROC values to those observed for
RNA-Seq-based GCNs using the aggregation strategy (Fig. 6). When evaluated with the GO and PPPTY datasets, the
performance of the protein network was lower than that of the aggregated network as well as the single network
from 1266 samples. To confirm this result, a two-way ANOVA was computed with pairwise comparison for the GO
evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A
subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were
significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for
some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 genes overlapped with our
RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the
protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes
to construct a protein network (Supplemental Fig. 7). No improvement could be detected for the protein network
derived from 17862 genes, with p-values of 0.635 for GO evaluation and 0.995 for PPPTY evaluation
from a one-sided Wilcoxon rank sum test.

PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be
compared based upon several network characteristics, including clustering coefficient, number of
nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007),
number of detected modules, and number of genes in the largest module. Number of nodes is a basic construct in
graph theory depicting the scale of a network. Clustering coefficient and number of modules model how
densely nodes are connected in networks. Heterogeneity measures the variability of node connections.
Centralization indicates how likely it is that some nodes have significantly more connections than average. In
this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological
characteristics like protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize
networks were selected for comparison of basic network characteristics based on their overall performance:
the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single
network from 1266 total samples (MS). The three networks were constrained to include the top one million
predicted interactions, or edges.
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution
(Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks
constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8)
and the node degree distribution (Supplemental Fig. 9) fit power-law models with R-squared values over 0.7. The
MS network had the highest network centralization value. The network heterogeneity value of MS was over two
times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table
S4), consistent with the observed highest centralization value for this network. Centralization and
heterogeneity are two measures that model the degree distribution of networks. A scale-free network with more
hubs has larger values of centralization and heterogeneity, whereas a network with larger values of
centralization and heterogeneity may either contain a larger number of hubs or have a number of hubs that is not
especially large but an extremely imbalanced degree distribution. In biological networks, many observations
have connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath
and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the
possibility that high values resulted from an extremely imbalanced degree distribution. For the MS network,
most highly connected genes interacted with a large number of lowly connected genes; this pattern is also
reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental
Fig. 8). The genes with the most interactions are expected to act as key components in GCNs
(Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein
biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were
analyzed in more detail, as these were potential “hub” genes that may regulate other expression patterns and
processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig.
7A). A total of 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong
candidates for central biological regulators. The annotation of these genes suggests their participation in a
range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation, and gene
silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of top interacting genes (27/148), which was
expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic
regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
To reveal the underlying properties of GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL),
was used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared
pattern between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger modules than the PA and SA networks. Consequently, most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network
centralization value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct
modules with related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA
and SA networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so
these methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged,
and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA
and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions
(Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall
biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php.
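The merging step above amounts to a set intersection over the top-ranked edges of each network. A minimal sketch, assuming each network is held as a mapping from a gene pair to its weight; the function names, toy networks, gene names, and k = 2 are hypothetical (the analysis itself used the top 1 million edges per network).

```python
def top_edges(weights, k):
    """Return the k highest-weighted edges as a set of unordered gene pairs."""
    best = sorted(weights, key=weights.get, reverse=True)[:k]
    return {frozenset(edge) for edge in best}

def merge_networks(networks, k):
    """Keep only the edges found in the top-k edge set of every network."""
    edge_sets = [top_edges(net, k) for net in networks]
    return set.intersection(*edge_sets)

# Hypothetical toy networks standing in for PA, SA, and MS.
pa = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.1}
sa = {("b", "a"): 0.95, ("a", "c"): 0.2, ("b", "c"): 0.7}
ms = {("a", "b"): 0.6, ("a", "c"): 0.5, ("b", "c"): 0.4}

merged = merge_networks([pa, sa, ms], k=2)
merged_genes = set().union(*merged)  # genes that remain in the merged network
```

Storing each edge as a frozenset makes the pair order irrelevant, so ("a", "b") in one network matches ("b", "a") in another.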

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged
network was compared to the existing knowledge of this pathway. Sixteen well-characterized components of
cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose
synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase
gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted
from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in
the network that met the analysis criteria. As expected, cell wall-related GO terms were enriched for these
214 genes (Fig. 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using
BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize
genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature
survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall-related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817
interactions with 76 genes; 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network was
provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization methods and network inference methods using real maize
RNA-Seq data will provide a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed in the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is applied on a per-library basis, which means genes within the same library are normalized by
similar factors. So when the network is constructed by PCC or BIC, where expression vectors are centered by
mean or median values, the effect of different normalization methods is probably small. The two rank
correlations, SCC and KCC, only consider differences in relative rankings, on which normalization has a limited
effect. The same holds for the GCC method. The estimation of mutual information is based on the k-nearest
neighbor method implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods
shared similar expression distributions (Supplemental Fig. 2), MI estimations from different normalizations are
expected to be similar.
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study of human GCN analysis (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods capture different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed a more robust performance than any single inference method in in silico and E. coli
expression networks, referring to “the wisdom of crowds”. However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an
RNA-Seq-based maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Process
The maize genome and its annotation were downloaded from Ensembl Plants Release 31
(http://plants.ensembl.org). The original 1303 RNA-Seq samples, generated on Illumina HiSeq 2000 or HiSeq 2500
instruments, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded
files were converted to fastq format using the fastq-dump command in the SRA Toolkit (version 2.5.2). Adapters
were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-removed files were then quality
checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim
et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated with
featureCounts 1.5.0 (Liao et al., 2014) from aligned BAM files (Supplemental Fig. S1). Twenty-six libraries with
fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving
1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was
streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).

Gene Count Normalization
The expression data was normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase Per Million (RPKM) were calculated with the edgeR package (Robinson et al.,
2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM
value). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM)
method in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al.,
2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in additional
analyses (15116 genes).
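The CPM transformation and expression filter described above can be sketched as follows. This is a simplified illustration rather than the edgeR workflow used here: library sizes are taken as raw column sums, so the TMM scale factors are omitted, and the function names and toy counts are hypothetical.

```python
import math

def log2_cpm(counts):
    """counts: {gene: [raw count per library]} -> {gene: [log2(CPM + 1) per library]}.
    Library size is the raw column sum (TMM scaling is NOT applied) and every
    library is assumed to contain at least one read."""
    genes = list(counts)
    n_libs = len(counts[genes[0]])
    lib_sizes = [sum(counts[g][j] for g in genes) for j in range(n_libs)]
    return {
        g: [math.log2(counts[g][j] / lib_sizes[j] * 1e6 + 1) for j in range(n_libs)]
        for g in genes
    }

def filter_genes(log_cpm, min_cpm=2, min_samples=1000):
    """Keep genes whose CPM exceeds min_cpm in more than min_samples libraries."""
    threshold = math.log2(min_cpm + 1)
    return {
        g: values for g, values in log_cpm.items()
        if sum(v > threshold for v in values) > min_samples
    }

# Hypothetical toy counts for three genes in two libraries.
counts = {"geneA": [500000, 250000], "geneB": [500000, 750000], "geneC": [0, 0]}
expr = log2_cpm(counts)
kept = filter_genes(expr, min_samples=1)  # toy cutoff; the text uses 1000 samples
```

The defaults of `filter_genes` mirror the thresholds stated in the text (2 CPM in more than 1000 samples); the toy call lowers the sample cutoff only because the example has two libraries.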

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function.
Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser
et al., 2009). Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc
package (Ma and Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package
(Langfelder and Horvath, 2008). Cosine similarity coefficient was computed with the cosine() function in the
coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and
Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the
smallest value assigned a rank of one. Ranks were then divided by the number of elements in the matrix, and the
diagonal was set to one, so that all network weights range from zero to one.

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM- or VST-normalized expression
matrices. Networks were then inferred from the randomized expression matrices with the PCC, MRNET, or CLR
methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For
MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
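The gene ID shuffling used for the random baseline can be sketched as a permutation of row labels. This is a minimal illustration, assuming the expression matrix is held as a mapping from gene ID to its expression profile; the function name, seed, and toy values are hypothetical.

```python
import random

def shuffle_gene_ids(expr, seed=None):
    """Reassign each expression profile to a randomly permuted gene ID.
    The profiles are untouched; only the gene-to-profile mapping is randomized."""
    rng = random.Random(seed)
    genes = list(expr)
    shuffled = genes[:]
    rng.shuffle(shuffled)
    return {new_id: expr[old_id] for old_id, new_id in zip(genes, shuffled)}

# Hypothetical toy matrix: three genes measured in three libraries.
expr_matrix = {"g1": [1.0, 2.0, 3.0], "g2": [3.0, 1.0, 2.0], "g3": [0.5, 0.5, 4.0]}
randomized = shuffle_gene_ids(expr_matrix, seed=1)
```

Because the profiles themselves are unchanged, the inferred edge weights keep the same distribution, but their link to the annotated gene IDs is broken, which is what gives a baseline AUROC.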
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their
results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco
et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data
for AGPv3.30 was downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq-confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification
problems was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of
co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate
AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package
(Ballouz et al., 2016) in R. EGAD applies the “guilt-by-association” principle that genes with shared GO
terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be
evaluated by hiding a subset of genes' GO terms and testing whether the hidden GO terms can be predicted
from the remaining annotations. The prediction model performance was measured by AUROC values in
three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and
pairwise.wilcox.test() functions from the stats package. The p-value adjustment method was set to “fdr”
(Benjamini and Hochberg, 1995).
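For readers unfamiliar with AUROC, the quantity can be computed directly from ranks. The sketch below is not the ROCR or EGAD code used in this analysis; it is a plain implementation of the rank-sum (Mann-Whitney U) identity, and the function name and toy scores are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum identity; tied scores receive average ranks.
    scores: predicted edge weights; labels: 1 for a known positive pair, else 0."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    positive_rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (positive_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hypothetical toy example: four gene pairs, two of them known positives.
example = auroc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

In the toy example, three of the four positive-negative pairs are ordered correctly, so the AUROC is 0.75; a value of 0.5 corresponds to random ordering.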
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the
evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are
co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not; TN, a
network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a
network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the
evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO
term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term
in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that
term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term
in our GO dataset.

Network Clustering and Characterization
For each network, the top 1 million edges were selected as stringent co-expression networks. The network
topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity
distribution and node degree distributions were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012).
Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14-137, with the inflation
value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO’s Singular Enrichment Analysis tool (Du et al., 2010).
The 15116 genes included in our networks were used as the background reference. Hypergeometric testing was used
to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was
used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The
results were then imported into Cytoscape for visualization.
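The hypergeometric test behind this kind of enrichment analysis can be sketched as an upper-tail probability. This is a generic illustration, not AgriGO's implementation: the function name is hypothetical, the background size (15116) and query size (214) are taken from the text, and the two term counts are made-up placeholders. The Yekutieli correction is not shown.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a query of n genes drawn from N background genes,
    of which K carry the GO term and k of those appear in the query."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Background and query sizes from the text; the term counts are hypothetical.
background_size = 15116
query_size = 214
genes_with_term = 300   # hypothetical number of background genes with the term
query_with_term = 20    # hypothetical number of query genes with the term
p_value = hypergeom_pvalue(query_with_term, query_size, genes_with_term, background_size)
```

Exact integer binomial coefficients are used so that the tail probability is not affected by floating-point underflow before the final division.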

Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011)
(Supplemental Table S8) were chosen as query genes to search against CORNET Maize
(https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING
database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching
the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50.
This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence
cutoff was set to 0.4, with the maximum number of interactors set to 100; 76 genes with 817 interactions were
retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version
2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice
and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department
of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful
discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year generated by microarray platforms
GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq
Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from GO datasets for comparisons with samples normalized using the Variance Stabilizing Transformation
(VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the
VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
samples constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman
Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for samples
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for samples constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the
25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines
indicate the highest and lowest AUROC values.
657
Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green
boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a
standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples
normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC,
AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient;
SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation
Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA,
Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

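The scaling and clustering distance named in the Figure 3 legend can be sketched as follows. This is a minimal illustration, assuming a z-score based on the sample standard deviation and clipping to the plotted range of -3 to 3; the legend states the range but not whether values were clipped. The function names and toy values are hypothetical.

```python
import math


def scale_and_clip(values, limit=3.0):
    """Z-score the values, then clip to [-limit, limit] (clipping is an assumption)."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [max(-limit, min(limit, (v - mean) / sd)) for v in values]


def euclidean_distance(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy AUROC profiles for two hypothetical inference methods.
scaled = scale_and_clip([1.0, 2.0, 3.0])
distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
```

Methods whose scaled AUROC profiles lie close together in this distance would be grouped together by hierarchical clustering.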
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17
different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)).
Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated
with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient;
MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

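The logarithmic model in Figure 4 is an ordinary least-squares fit of average AUROC against log(sample size). The legend states that the fit was done with lm() in R; the sketch below is only a Python analogue of that calculation, and it reports the slope, intercept, and R-squared but not the p-value. The function name and the data points are hypothetical.

```python
import math


def fit_log_model(sample_sizes, mean_aurocs):
    """Least-squares fit of mean_auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(mean_aurocs)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not data from the study).
sizes = [12, 24, 100, 404, 1266]
aurocs = [0.5 + 0.02 * math.log(n) for n in sizes]
slope, intercept, r_squared = fit_log_model(sizes, aurocs)
```

A positive slope in such a fit means performance keeps improving with sample size, but with diminishing returns on the original scale.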
Figure 5. Evaluation of networks constructed with different numbers of samples using the PCC (black), SCC
(green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks
constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B,
Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266)
libraries were plotted against sample size. Networks with the same number of samples are designated
"_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient;
MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 6. GCN performance comparison among the single network (white, "1266"), aggregated network (grey,
"agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of
networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC
values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are
indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC
values. Bold horizontal lines indicate the median; stars indicate the mean value of each box. Outliers are
plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey
bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s),
MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey
dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars),
aggregation network (grey bars), and protein network (dark grey bars) were compared for networks
constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median,
asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared
among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS,
MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks;
106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148
highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214
co-expressed genes queried by 16 cell wall pathway genes.

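The edge overlap summarized in Figure 7B amounts to taking the highest-scoring edges of each network and intersecting the resulting sets. The sketch below illustrates that operation on toy data. It assumes undirected edges stored in a dictionary keyed by gene pairs; the function names, gene identifiers, and scores are all hypothetical.

```python
def top_edges(edge_scores, n):
    """Return the n highest-scoring edges as a set of unordered gene pairs."""
    ranked = sorted(edge_scores.items(), key=lambda kv: kv[1], reverse=True)
    return {frozenset(edge) for edge, _ in ranked[:n]}


def shared_edges(*edge_sets):
    """Edges present in every one of the given edge sets."""
    return set.intersection(*edge_sets)


# Toy networks (hypothetical genes and scores); edge direction is ignored.
net_a = {("g1", "g2"): 0.9, ("g2", "g3"): 0.8, ("g1", "g3"): 0.1}
net_b = {("g2", "g1"): 0.7, ("g3", "g4"): 0.6, ("g2", "g3"): 0.2}
overlap = shared_edges(top_edges(net_a, 2), top_edges(net_b, 2))
```

Using `frozenset` for each pair makes ("g1", "g2") and ("g2", "g1") count as the same edge, which suits symmetric co-expression scores.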
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA),
and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted
interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red
nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey
nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate
network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall
pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in
plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways.
Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded
boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired
from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database,
converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level
reads were counted (Read Count) to generate an expression matrix, which was imported into the R
environment for the normalization, inference, and evaluation steps. All networks were visualized in
Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are
listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by
tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by
fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize
genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries
originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes
represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in
the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization
by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million
mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black
line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation
(log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines)
was calculated using the dnorm() function in R, which takes the mean and standard deviation from the
log2-transformed expression values. C, Normalized gene expression values for 15,116 genes were averaged
across libraries and plotted as a function of gene length in base pairs (bp).

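The CPM and RPKM normalizations named in this legend follow standard definitions, which the sketch below spells out for a single library. It is an illustration rather than the study's pipeline: the library total is taken as the sum of the supplied gene counts, and the pseudocount added before the log2 transformation is an assumption, since the legend does not state one. Function names and the toy counts are hypothetical.

```python
import math


def cpm(counts):
    """Counts per million: scale each gene count by the library total."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def rpkm(counts, gene_lengths_bp):
    """Reads per kilobase per million: CPM further divided by gene length in kb."""
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, gene_lengths_bp)]


def log2_transform(values, pseudocount=1.0):
    """log2 after adding a pseudocount (the pseudocount value is an assumption)."""
    return [math.log2(v + pseudocount) for v in values]


# Toy library with three genes (hypothetical counts and lengths).
counts = [10, 30, 60]
lengths = [1000, 2000, 500]
cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths)
```

Because RPKM divides by gene length and CPM does not, plotting both against gene length, as in panel C, shows how each method handles length dependence.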
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental
stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues,
calculated by Pearson correlation coefficient, ranging from 0.6 to 1.0. Red indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among the results of inference methods. A, GO evaluation
comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method is plotted
in the diagonal blocks, between the AUROC values and the PCC values. AUROC values evaluated with GO
datasets are plotted pairwise in the triangle below the diagonal, with the corresponding coefficient
values, calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation
comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method is plotted
in the diagonal blocks, between the AUROC values and the PCC values. AUROC values evaluated with PPPTY
datasets are plotted pairwise in the triangle below the diagonal, with the corresponding coefficient
values, calculated by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson
Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient;
GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA,
Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of
Relatedness.

Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes
with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average
expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed
elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A,
AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were
plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12)
to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are
designated "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile
range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal
lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean
values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman
Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

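The outlier definition used in these box plots (more than 1.5 times the interquartile range beyond the quartiles) can be written out as a short sketch. The quartile convention is an assumption: the code uses Python's "inclusive" method, and the plotting software used for the figures may compute quartiles slightly differently. The function name and data are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Values more than k * IQR above the upper quartile or below the lower quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Toy AUROC-like values with one extreme point (hypothetical data).
flagged = iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

With small samples, different quartile conventions can move the fences, so a borderline point flagged here may not be flagged by another tool.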
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve
(AUROC) values from GO evaluation of protein networks with 17,862 genes (ppr_all) and with 11,429 genes
(ppr). B, AUROC values from PPPTY evaluation of protein networks with 17,862 genes (ppr_all) and with
11,429 genes (ppr). Both networks were constructed using the Pearson Correlation Coefficient (PCC). Bold
horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network.
Red and blue curves show the power-law fitted distribution. The R² value indicates the goodness of fit to
the power-law model.

Supplemental Figure 9. Node degree distribution for four networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS), and the intersection of the three networks (Merged network). The number of edges
linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes).
The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.

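A power-law fit of the kind shown in Supplemental Figures 8 and 9 is often obtained by linear regression on log-transformed values. The sketch below shows that approach as an illustration. It is an assumption that the curves in the figures were fitted this way, and the R² here is computed on the log scale. The function name and the degree counts are hypothetical.

```python
import math


def fit_power_law(degrees, node_counts):
    """Fit node_count = a * degree ** b by least squares on log-log values."""
    x = [math.log(d) for d in degrees]
    y = [math.log(c) for c in node_counts]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Hypothetical degree distribution following count = 1000 * degree ** -2.
a, b, r_squared = fit_power_law([1, 2, 4, 8], [1000.0, 250.0, 62.5, 15.625])
```

A negative exponent with a high R² indicates that a few genes have many connections while most have few, the pattern usually described as scale-free.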
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are
highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B: 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R package version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics: 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics: 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: 10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science: 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter CT 3rd, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Front Genet 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, De Leon N, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome: 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant 1032 Physiol 153 895ndash905 1033
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism 1034 and Beyond Nat Rev Mol Cell Biol 16 258ndash264 1035
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR 1036 (2016) Integration of omic networks in a developmental atlas of maize Science 353 814ndash818 1037
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene 1038 ranking BMC Bioinformatics 16 S6 1039
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring genendashgene interactions and functional 1040 modules using sparse canonical correlation analysis Ann Appl Stat 9 300ndash323 1041
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 1042 57ndash63 1043
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray 1044 experiments BMC Genomics 5 87 1045
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability ofldquo guilt-by-associationrdquo 1046 within gene coexpression networks BMC Bioinformatics 6 227 1047
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) ldquoOut of Pollenrdquo Hypothesis for Origin of New 1048 Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822ndash2829 1049
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of 1050 Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during 1051 Seed Development Plant Cell 28 629ndash645 1052
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 1053 13 83 1054
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.
Figure 2. Effects of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
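The outlier rule stated in the Figure 2 legend (values more than 1.5 times the interquartile range beyond the quartiles) can be written out explicitly. The following is a minimal Python sketch, not the plotting code used for the figure: the function names and the AUROC values are hypothetical, and the quantile interpolation (the inclusive method of Python's `statistics.quantiles`) is an assumption that may differ slightly from the quantile definition used by the original plotting software.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values lying outside the fences, in input order."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values for illustration only (not data from the study)
auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
print(find_outliers(auroc))  # [0.95]
```

Values flagged in this way correspond to the points drawn individually beyond the box plot whiskers.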
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values displayed on a scale from blue to red. Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
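The scaling and distance steps named in the Figure 3 legend can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the analysis code: it assumes that each GO term (row) is scaled across methods to mean 0 and standard deviation 1, uses the population standard deviation (R's `scale()` uses the sample standard deviation), and works on a hypothetical table of AUROC values with hypothetical function names.

```python
from math import dist
from statistics import mean, pstdev


def zscore(row):
    """Scale one row to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(row), pstdev(row)
    return [(x - mu) / sigma for x in row]


def method_distances(table, methods):
    """Euclidean distance between method columns after row-wise scaling."""
    scaled = [zscore(row) for row in table]
    columns = list(zip(*scaled))  # one tuple of scaled values per method
    return {
        (methods[i], methods[j]): dist(columns[i], columns[j])
        for i in range(len(methods))
        for j in range(i + 1, len(methods))
    }


# Hypothetical AUROC values: rows are GO terms, columns are methods
methods = ["PCC", "SCC", "CLR"]
auroc = [
    [0.70, 0.71, 0.60],
    [0.65, 0.66, 0.80],
    [0.55, 0.56, 0.75],
]
d = method_distances(auroc, methods)
```

In this toy table the PCC and SCC columns are nearly identical, so their distance is much smaller than the distance from either to CLR; a hierarchical clustering built on such distances would group them together first.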
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
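The logarithmic model described in the Figure 4 legend, average AUROC regressed on the natural logarithm of sample size, can be sketched as an ordinary least squares fit. The Python below is an illustrative stand-in for the `lm()` call in R, not the original analysis: the function name and the data are hypothetical (the values are generated from a known curve so the fit can be checked), and the p-value calculation is omitted.

```python
from math import log


def fit_log_model(sample_sizes, auroc):
    """Least squares fit of auroc = intercept + slope * ln(sample_size)."""
    x = [log(n) for n in sample_sizes]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(auroc) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, auroc))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared = 1 - residual sum of squares / total sum of squares
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, auroc))
    ss_tot = sum((yi - y_bar) ** 2 for yi in auroc)
    return intercept, slope, 1 - ss_res / ss_tot


# Hypothetical values generated from auroc = 0.5 + 0.03 * ln(n)
sizes = [12, 50, 200, 800, 1266]
values = [0.5 + 0.03 * log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
```

A positive slope in such a fit means that performance keeps increasing with sample size, but with diminishing returns, since each gain requires a multiplicative increase in the number of libraries.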
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as “_1”, “_2”, and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Fig 7. Panels A and B: Venn diagrams for the MS, PA, and SA networks. Panels C and D: GO term enrichment graphs labeled Hub and Cell Wall.
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabasi A-L Oltvai ZNZN Barabási A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics, pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science, pp 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol. doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One. doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Zea mays (maize) is the most widely produced crop in the United States, and US agriculture accounted for 36% of world maize production in 2015 (USDA, 2016). Maize has also been at the center of genetics research for over 100 years, including McClintock's pioneering work with transposable elements (TEs) (reviewed by McClintock, 1983; Fedoroff, 2012). Due to recent technological advances in nucleic acid sequencing and the availability of the maize genome sequence (Schnable et al., 2009), maize genomics research has been greatly expedited.
RNA-Sequencing (RNA-Seq) has become the favored technique for detecting genome-wide expression patterns. RNA-Seq has some advantages over microarray analysis of gene expression, including single-base-pair resolution, detection of novel transcripts, and the ability to analyze transcript abundance without existing genome information (reviewed by Wang et al., 2009; Han et al., 2015; Conesa et al., 2016). RNA-Seq data also provide information about single nucleotide polymorphisms (SNPs), which facilitates Genome-wide Association Studies (GWAS) (Fu et al., 2013; Li et al., 2013a; Lonsdale et al., 2013; Fadista et al., 2014). Because of its widespread adaptability, over five thousand Illumina-platform maize RNA-Seq libraries (Fig. 1A) are available in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (Leinonen et al., 2010), adding to the body of data that can be used to study the maize genome.
The maize genome is large and heterogeneous, and the genome annotation is still far from complete (Mark Cigan et al., 2005; Ficklin and Feltus, 2011). Although recent work has made substantial progress toward describing genome-wide expression patterns in many genotypes, environmental conditions, and tissues, relatively little is known about the function and regulation of most maize genes. Because genes with related biological functions or regulatory mechanisms often have similar expression patterns (Aoki et al., 2007), one way to enhance understanding of gene function is by construction of a Gene Co-expression Network (GCN) (D'haeseleer et al., 2000; Aoki et al., 2007; Usadel et al., 2009; Li et al., 2015c; Serin et al., 2016). GCNs are constructed using data mining tools and algorithms that describe the relatedness between the expression patterns of multiple genes in a pairwise fashion.
The use of GCNs pre-dates the availability of RNA-Seq expression data (Ficklin and Feltus, 2011; Sato et al., 2011; De Bodt et al., 2012), meaning that these approaches were initiated and optimized predominantly with microarray datasets. Maize RNA-Seq samples are already five times more abundant than microarray samples (Fig. 1) and are increasing in number, meaning that an RNA-Seq-oriented maize GCN protocol would be valuable to the scientific community. Although the initial inputs and results from microarray and RNA-Seq are similar, there are many differences between the data types and analytical approaches. It is therefore anticipated that some adjustments to GCN parameters may improve the efficacy of GCN analysis of RNA-Seq data. GCN construction is typically a multistep process consisting of normalization of input datasets, network inference, network evaluation, and interpretation (Supplemental Fig. 1).
Both RNA-Seq and microarrays are affected by systematic variations (Park et al., 2003; Oshlack and Wakefield, 2009; Zheng et al., 2011; Li et al., 2014b). Therefore, genome-wide expression results generated by either technique need to be normalized prior to analysis (Dillies et al., 2013a; Li et al., 2015b). Variance Stabilizing Transformation (VST), Counts Per Million (CPM), and Reads Per Kilobase per Million (RPKM) are three popular normalization methods for RNA-Seq experiments (Mortazavi et al., 2008; Anders and Huber, 2010; Rau et al., 2013).
Some work has been done to evaluate the efficacy of different normalization methods for expression analysis. Giorgi et al. (2013) showed that VST normalization of RNA-Seq data resulted in a GCN with similar characteristics to a microarray-supported network in terms of coefficient and node degree distribution. Normalization with CPM, using the Trimmed Mean of M-values (TMM) to adjust for the composition bias between RNA-Seq datasets by calculating normalization factors (Robinson et al., 2010), increased the robustness of analysis among diverse library sizes and compositions (Dillies et al., 2013a). These studies suggest that optimizing normalization methods might improve GCN performance.
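As a rough illustration of two of the normalization methods named above, the following Python sketch computes CPM and RPKM from a raw count matrix. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the function names and the genes-by-libraries layout are hypothetical, TMM scaling factors are omitted, and VST is not shown because it depends on a fitted dispersion model.

```python
import numpy as np

def cpm(counts):
    """Counts Per Million: scale every library (column) to one million reads.

    counts: genes x libraries array of raw read counts (hypothetical layout).
    """
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)            # total counted reads per library
    return counts / lib_sizes * 1e6

def rpkm(counts, gene_lengths_bp):
    """Reads Per Kilobase per Million: CPM divided by gene length in kb."""
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float) / 1e3
    return cpm(counts) / lengths_kb[:, None]  # broadcast lengths over libraries
```

Because RPKM divides by gene length, two genes with identical CPM values receive different RPKM values when their lengths differ, which is the property examined later in the comparison of gene length versus expression.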
There are several methods for gene network inference, including correlation, mutual information (MI), Bayesian network, and probabilistic graphical models. Typically, correlation and MI methods are used for constructing large-scale GCNs with more than ten thousand genes (Krouk et al., 2013). Correlation methods include Pearson correlation coefficient (PCC), Spearman's correlation coefficient (SCC), Kendall rank correlation coefficient (KCC), Gini correlation coefficient (GCC), and biweight midcorrelation (BIC) (Langfelder and Horvath, 2008; Kumari et al., 2012; Ma and Wang, 2012; Ballouz et al., 2015). Cosine similarity coefficient (CSC) has also been used for computing similarities in sparse datasets such as text (Dhillon and Modha, 2001) and protein-protein interaction data (Luo et al., 2015). MI methods include the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR) (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007). The network inference method might also influence GCN performance.
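To make the difference between some of these correlation measures concrete, the sketch below computes PCC, SCC, and CSC for every gene pair of a small expression matrix. It is an illustrative sketch only: the function names are hypothetical, the Spearman ranks are ordinal (ties are broken by position, which assumes ties are rare in continuous normalized data), and the MI-based methods are not covered.

```python
import numpy as np

def pearson(expr):
    """Pearson correlation between all pairs of rows (genes)."""
    return np.corrcoef(expr)

def spearman(expr):
    """Spearman correlation: Pearson correlation of within-gene ranks."""
    ranks = np.argsort(np.argsort(expr, axis=1), axis=1).astype(float)
    return np.corrcoef(ranks)

def cosine(expr):
    """Cosine similarity between all pairs of rows (genes)."""
    expr = np.asarray(expr, dtype=float)
    unit = expr / np.linalg.norm(expr, axis=1, keepdims=True)
    return unit @ unit.T
```

For two genes whose expression rises together but not linearly (for example 1, 2, 4, 8 against 1, 2, 3, 4), the Spearman coefficient equals 1 while the Pearson coefficient is below 1, which is one reason the choice of inference method can change edge weights.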
Several resources are already available for GCN analysis in maize, including COB (Schaefer et al., 2014), CORNET (De Bodt et al., 2012), CoP (Ogata et al., 2010), PLANEX (Yim et al., 2013), and ATTED-II (Obayashi et al., 2009). All of these databases except ATTED-II used PCC to build GCNs from 128 to 379 microarray datasets. ATTED-II recently updated its database to provide GCNs from both microarray and RNA-Seq data using PCC-based mutual rank (Aoki et al., 2015). Although PCC is widely used, there is very limited evidence that it is the optimal approach for GCN analyses.
GCNs could also be improved by meta-analysis using ranked aggregation from individual networks (Zhong et al., 2014; Ballouz et al., 2015; Wang et al., 2015a). By aggregating individual experiments, only interactions consistent among networks are preserved, which helps reduce noise and highlights conserved interactions. Furthermore, the ranked aggregation method provides a way to efficiently increase the size of the aggregated network with newly available datasets, because recalculation with all datasets is not required when a new one is added. This provides an efficient way to process and incorporate emerging information.
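The incremental property described above can be sketched as follows. This is a hedged illustration that assumes one common form of ranked aggregation (rank-standardize each network, sum the standardized ranks, then re-rank the sum); the exact procedure used in this study is described in Methods and may differ. The names `rank_standardize` and `RankAggregator` are hypothetical.

```python
import numpy as np

def rank_standardize(net):
    """Replace edge weights of a symmetric network by ranks scaled to (0, 1]."""
    net = np.asarray(net, dtype=float)
    iu = np.triu_indices_from(net, k=1)             # each gene pair once
    ranks = np.argsort(np.argsort(net[iu])) + 1.0   # 1 = weakest edge
    out = np.zeros_like(net)
    out[iu] = ranks / ranks.size
    return out + out.T                              # restore symmetry

class RankAggregator:
    """Keeps a running sum so a new network is added without recomputing old ones."""

    def __init__(self, n_genes):
        self.total = np.zeros((n_genes, n_genes))
        self.n_networks = 0

    def add(self, net):
        self.total += rank_standardize(net)
        self.n_networks += 1

    def aggregate(self):
        return rank_standardize(self.total)
```

Only the running sum is stored, so adding a dataset costs one ranking of the new network plus one re-ranking of the sum, regardless of how many networks were added before.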
Herein, an extensive evaluation of maize GCN construction is reported. Three parameters were tested: normalization method, network inference algorithm, and ranked aggregation method. To our knowledge, this is the first comprehensive attempt to optimize GCN construction using plant RNA-Seq datasets. The network is publicly accessible at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php. A tutorial is also provided as supplemental material.
Results
Manually Curated Maize mRNA Expression Profiling from Publicly Available Datasets
Recently, the usage of RNA-Seq in maize has increased dramatically, from zero data entries in NCBI-SRA in 2008 to over nine hundred in 2016 (Fig. 1B). In contrast, the most widely used Affymetrix expression array for maize had 177 samples in 2008 but only 46 in 2016 (Fig. 1B). GCN construction approaches have not been optimized for RNA-Seq datasets in plants, and doing so could improve the quality and robustness of GCNs. To support a comprehensive evaluation of the effect of RNA-Seq normalization methods and network inference methods on the performance of GCNs, maize RNA-Seq datasets were compiled and processed with a computational pipeline (Supplemental Fig. 1). A total of 1266 high-quality RNA-Seq maize libraries from 17 different experiments were selected as input to an expression matrix. The corresponding experimental descriptions and publications, where available, of each library were manually checked for sample information (Supplemental Table S1). Filters for read depth and alignment rate were also used to remove unqualified libraries (see Methods for detail). Tissue type and haplotype from those libraries were manually curated and found to include a range of sample types (Supplemental Table S1). Shoot apical meristem (SAM), leaf, and root were the three most abundant tissue types, but a wide range of tissues were represented by multiple libraries in the dataset (Supplemental Fig. 1). The dataset also included multiple haplotypes, although B73 represented approximately 40% of the included libraries. To reduce noise, lowly expressed genes were removed from analysis, leaving 15116 nonredundant genes across the 1266 libraries. For comparative purposes, the Affymetrix GeneChip maize array includes 13339 genes before filtering (GeneChip Maize Genome Array, http://www.affymetrix.com/catalog/131468/AFFY/Maize+Genome+Array#1_1).
Three RNA-Seq Normalization Methods Show Comparable Distribution of Expression
Expression data from distinct sources and experiments can be highly variable because of hybridization artifacts in microarrays or variable sequencing depth in RNA-Seq. Many methods have been successfully used for normalizing both microarray and RNA-Seq data to correct for potential biases (Lim et al., 2007; Dillies et al., 2013b; Li et al., 2015b). To find an optimal normalization method for building a maize GCN from RNA-Seq data, three widely used normalization methods were compared: Variance Stabilizing Transformation (VST), Counts Per Million (CPM), and Reads Per Kilobase per Million (RPKM) (Mortazavi et al., 2008; Anders and Huber, 2010; Rau et al., 2013). For all normalization methods, log2 transformation of the normalized expression values reduced the skew of the data distribution (Supplemental Fig. 2). Several network studies using plant RNA-Seq data applied log2 transformation (Davidson et al., 2011; Ma and Wang, 2012; Giorgi et al., 2013; Stelpflug et al., 2015; Walley et al., 2016). In our analysis, genes with CPM > 2 in more than 1000 samples were included. This filter dramatically reduced zero count values in the raw data from 30.949% to 0.367%. Moreover, a prior count of one was added at log2 normalization (expression = log2(CPM/RPKM + 1)) to avoid problems with the remaining zero values. The log2 transformation reduced skewed distributions and extreme values represented by outliers (Supplemental Fig. 2). Thus, we consider it important to apply log2 transformation to our data.
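The expression filter and log2 transformation described above can be sketched in a few lines of Python. This is a minimal sketch, assuming a genes-by-libraries matrix of CPM values; the function name and parameter names are hypothetical, and the thresholds default to the values stated in the text (CPM > 2 in more than 1000 samples, prior count of one).

```python
import numpy as np

def filter_and_log2(cpm_matrix, min_cpm=2.0, min_samples=1000, prior=1.0):
    """Keep genes above min_cpm in more than min_samples libraries, then log2.

    cpm_matrix: genes x libraries array of normalized values (hypothetical layout).
    Returns the transformed matrix of retained genes and the boolean keep mask.
    """
    cpm_matrix = np.asarray(cpm_matrix, dtype=float)
    n_expressed = (cpm_matrix > min_cpm).sum(axis=1)   # libraries passing per gene
    keep = n_expressed > min_samples                   # strictly "more than"
    return np.log2(cpm_matrix[keep] + prior), keep
```

Adding the prior count before taking the logarithm maps any remaining zero values to 0 instead of negative infinity.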
The distribution of gene expression across the 1266 libraries formed a bell-shaped curve with a small additional peak of low expression for all three methods (Supplemental Fig. 2). To determine whether these low expression values came from a few or multiple libraries, elements within the range of expression that corresponded to the observed peak (< -3.7 CPM; Supplemental Fig. 2B) were extracted from the CPM-normalized expression matrix and matched to the originating libraries. This demonstrated that the low expression elements were not limited exclusively to specific libraries, but eight libraries contributed over 25% of the low elements. A gene ontology (GO) enrichment analysis failed to identify significant GO descriptors within the subset of 43 genes that were defined as lowly expressed (data not shown). All eight of these libraries were from pollen tissue, where the average gene expression at 147 Counts Per Million (CPM) is lower than the average gene expression of the other 79 tissues combined at 183 CPM. Hierarchical clustering and a correlation heatmap with the same data (Stelpflug et al., 2015) show the uniqueness of the pollen tissue expression pattern (Langfelder and Horvath, 2008) (Supplemental Fig. 3). When the lowly expressed elements from RPKM- and VST-normalized data were analyzed to determine library origin and GO enrichment (data not shown), we found a similarly high level of pollen-specific libraries without significant GO categories. In pollen, some highly expressed genes are considered orphan genes (Wu et al., 2014) because they lack detectable homologs in other species. To investigate whether these lowly expressed genes were orphan genes, their gene sequences were searched against the Setaria italica genome (JGI v2) using BLASTX (e-value < 1E-03). Setaria italica (foxtail millet) is a close relative of maize that diverged 23.4 million years ago (MYA) as estimated by TimeTree (Kumar et al., 2017). Only 1 out of 43 genes lacked detectable homologs in Setaria italica (data not shown), indicating that the majority of these genes are not likely to be orphan genes.
Because RPKM normalization accounts for gene length, the distribution of gene length versus expression for the RPKM method was compared to data normalized by the VST and CPM methods. VST- and CPM-normalized data showed very similar overall patterns, with no clear linear relationship between gene length and average expression (Supplemental Fig. 2C). RPKM-normalized data displayed an apparent bias toward elevated expression of a small number of genes less than 5000 bp in length and lower expression of long genes, suggesting that this normalization method might skew the distribution of expression for some genes. Overall, in spite of these differences, the three normalization methods resulted in a similar distribution of expression patterns for most of the genes included in the analysis. Additional analysis was completed to determine whether the three normalization methods influence network performance.
Network Performance Does Not Differ Based Upon Normalization Method
To compare the efficacy of three normalization and ten inference methods, a GCN was generated for each combination of normalization and inference methods. Furthermore, all networks were rank-standardized so that edge weights range from 0 to 1 (see Methods). All network evaluations used the whole adjacency matrix (15116 × 15116 in RNA-Seq networks; 11429 × 11429 or 17862 × 17862 in protein networks) without a cut-off.
The performance of the different networks was measured by comparing the area under the receiver operating characteristic curve (AUROC). AUROC is a measurement used to evaluate the accuracy of classification models, making it suitable for evaluating GCNs (Gillis and Pavlidis, 2011; Ma and Wang, 2012; Liu et al., 2017). AUROC values range from 0 to 1, with a value closer to 1 indicating that the network is discriminating nonrandom patterns and approaching perfect classification, random networks returning values close to 0.5, and values closer to 0 indicating a high degree of incorrect classification. While an AUROC value close to 1 is optimal, values over 0.7 suggest good performance when analyzing large, diverse networks (Gillis and Pavlidis, 2011).
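To show how an AUROC value relates to edge weights, the sketch below computes AUROC from ranks (the Mann-Whitney formulation), given a vector of edge weights between one gene and all others and a vector marking which of those genes are known partners. It is a generic illustration with a hypothetical function name, not the evaluation code used in this study.

```python
import numpy as np

def auroc(scores, labels):
    """AUROC via the rank-sum statistic; tied scores receive average ranks."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both positive and negative labels are required")
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size)
    i = 0
    while i < scores.size:
        j = i
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0   # average 1-based rank
        i = j + 1
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)
```

The returned value is the probability that a randomly chosen known partner has a higher edge weight than a randomly chosen non-partner, which is why uninformative weights give values near 0.5.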
To set up the AUROC baseline for random networks, maize gene IDs in the normalized expression matrix were
shuffled 10 times (for MRNET and CLR) or 1000 times (for PCC). Networks were inferred from the randomized
expression matrices using the designated algorithms and then evaluated. The resulting AUROC values from randomized
networks were very close to 0.5 (Supplemental Table S2).
AUROC values were calculated and compared for three different network characteristics. The first
characteristic was designed to test whether the network identified genes with known or predicted co-expression
patterns, based upon prior results and inclusion in two existing datasets that could serve as a positive control
for co-expression. The maize metabolic pathway database (MaizeCyc) contains 413 pathways with more than two genes
and was built from evidence collected from genome annotation, phylogenetic distance, and known
genes in maize, rice, and Arabidopsis (Monaco et al., 2013). The maize protein-protein interaction database
(PPIM), which is based upon both predicted and experimentally detected protein interactions (Zhu et al., 2016),
was the second dataset used in this analysis. Only high-confidence interactions from PPIM were used, as
defined by ranking in the top 5% in their model (Zhu et al., 2016). For comparison with the GCN, genes within the
same MaizeCyc or PPIM pathways were considered co-expressed. The MaizeCyc and PPIM datasets were
combined and genes with fewer than 5 interactions were excluded from evaluation, creating a compiled dataset
referred to herein as the Protein-Protein and Pathway dataset (PPPTY). PPPTY had 1720 genes and 104856
interactions that were used in this evaluation. The AUROC value was calculated for each of the 1720 gene
terms.
To assess the effect of normalization method on GCNs, AUROC values for all ten inference methods were
averaged for each of the three normalization methods. All three normalization methods scored similarly in
comparison with the PPPTY dataset (Fig. 2B), with a mean AUROC value around 0.575 for each, suggesting
that the predicted networks were more selective than a random network.
The second characteristic was the presence of similar gene ontology (GO) information for maize genes within
a detected co-expression set, based upon "guilt by association", which assumes that specific subgroups of
co-expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from
AgriGO (Du et al., 2010), which uses signature integration by InterPro, rather than co-expression data, to map
gene IDs to GO terms. InterPro provided over 108 million stable GO terms to the functional protein
information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations
provide a reliable evaluation resource independent of co-expression data. To assess this characteristic, gene
ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for sets of
co-expression matrices, and the results were compared. Co-expression matrices were assessed by 3-fold cross-validation, which
involved masking GO terms from some genes to test whether the masked GO terms could be predicted based
upon gene expression patterns. A total of 277 GO terms were included in this analysis.
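The masking procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the published implementation: each gene is scored by the fraction of its total edge weight that goes to genes still annotated with the term, the annotated genes are split into three folds by simple striding, and the helper names (`pairwise_auroc`, `neighbor_voting_auroc`) are hypothetical.

```python
def pairwise_auroc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def neighbor_voting_auroc(weights, annotated, folds=3):
    """Mean cross-validated AUROC for one GO term.

    weights   : symmetric list-of-lists of edge weights (genes indexed 0..n-1)
    annotated : set of gene indices carrying the GO term
    """
    n = len(weights)
    positives = sorted(annotated)
    negatives = [g for g in range(n) if g not in annotated]
    results = []
    for fold in range(folds):
        hidden = set(positives[fold::folds])            # masked annotations
        visible = [g for g in positives if g not in hidden]

        def score(gene):
            total = sum(weights[gene][h] for h in range(n) if h != gene)
            votes = sum(weights[gene][h] for h in visible if h != gene)
            return votes / total if total else 0.0

        pos_scores = [score(g) for g in hidden]
        neg_scores = [score(g) for g in negatives]
        if pos_scores and neg_scores:
            results.append(pairwise_auroc(pos_scores, neg_scores))
    return sum(results) / len(results)


# Toy network: genes 0-2 are tightly co-expressed, as are genes 3-5.
strong, weak = 0.9, 0.1
toy = [[strong if (i < 3) == (j < 3) else weak for j in range(6)] for i in range(6)]
toy_auroc = neighbor_voting_auroc(toy, annotated={0, 1, 2})
```

In the toy network the masked genes are always recovered ahead of the unannotated genes, so the cross-validated AUROC is at its maximum.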
When GO characteristics were used to assess the networks, all three normalization methods performed
similarly, but the AUROC values, at around 0.689 for each, were higher than those observed for comparisons
with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein
interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions.
The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets were
0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant difference
in the AUROC scores associated with the GCNs for the characteristics that were tested.
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a
subset of co-expressed genes. The interactions between such a protein and regulated DNA can be detected
by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, there are five
ChIP-Seq datasets available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et al., 2015;
Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example, Opaque2 is
specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral tissues (Bolduc
et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed (Morohashi et al.,
2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for comparison, with 26
confirmed binding targets that are relatively highly expressed in most maize tissues (Yang et al., 2016). Histone
deacetylation often correlates with decreased gene expression (Verdin and Ott, 2014). High-confidence
HDA101 targets were defined as those discovered by ChIP-Seq that also showed increased gene
expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101 targets were
compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar among
networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY evaluations.

Correlation Methods Perform Better than Mutual Information at Some Genes
After normalization, the expression matrices can be processed by different methods for GCN inference.
To optimize this step, the AUROC values of six correlation methods (PCC, SCC, KCC, GCC, BIC, CSC) and four mutual
information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices
generated from each of the three normalization methods (VST, CPM, RPKM) and then averaged. In general,
correlation methods are more computationally efficient, while MI methods are able to reveal non-linear
relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC,
KCC, and BIC are less sensitive to outliers because SCC and KCC consider only rank information and BIC
is calculated from the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to
be a better correlation method for gene expression analysis because of its capacity to detect non-linear
relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and
for analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR showed
extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al.,
2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same
AUROC-based tests were performed as described for the testing of normalization
methods.
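To make the difference between three of these measures concrete, the sketch below implements PCC, SCC, and CSC for a pair of expression vectors in plain Python. The function names and the two toy vectors are assumptions for illustration; the study itself used the R functions listed in Materials and Methods.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length vectors."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def average_ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation (SCC): PCC computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cosine(x, y):
    """Cosine similarity (CSC): no centering, so absolute level matters."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))


# Hypothetical expression vectors; gene_a has one outlying sample.
gene_a = [1.0, 2.0, 3.0, 4.0, 100.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 10.0]
pcc, scc, csc = pearson(gene_a, gene_b), spearman(gene_a, gene_b), cosine(gene_a, gene_b)
```

With these vectors the single outlier pulls PCC well below the rank-based SCC, which is the sensitivity to outliers noted above.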
Assessed by the GO dataset, the 277 AUROC values were averaged to create one average value for each of the
10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization
methods for the six correlation methods was 0.718, while the average AUROC for the four MI methods was
0.646. The majority of the 277 GO terms had similar AUROC values in the different correlation method-generated
GCNs, and these patterns differed from those observed in the MI-generated GCNs (Fig. 3A).
The similarity among different methods was also detectable by pairwise comparison using Pearson
correlations between the different methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were
averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed
that the networks constructed using correlation methods resulted in higher AUROC values than MI methods,
although the CSC method resulted in lower AUROC values than other correlation methods. As demonstrated
for the GO evaluation, results from correlation methods were more similar to each other than to those from the MI methods
(Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher
AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B), although this subset was small enough
that the average AUROC value over the whole gene set was relatively low for those methods.
The gene sets with the highest AUROC values in PCC, CSC, or MRNET were extracted. The
gene sets were compared by average expression (CPM) and average number of lowly expressed elements
(CPM < 0). The CSC gene set had the smallest number of lowly expressed elements and had higher average
expression than both the 1720-gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that
the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from the 26 HDA101 ChIP-Seq targets revealed that the CSC GCN had the highest
AUROC value and that the MRNET/CLR GCNs had slightly higher scores than those from correlation methods
(Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also
indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes
(Tzfadia et al., 2016). HDA101 is a highly expressed gene in all samples, with an average expression value of
864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is better suited to the
CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using the
two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5%
non-zero values for all comparisons, so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize
GCN from a single expression matrix, but co-expression with some individual genes may be better detected
using MI methods. Normalization method did not have a substantial influence on GCN performance, so only
CPM normalization was used in conjunction with PCC, SCC, MRNET, and CLR inference for subsequent
optimization of other parameters.

Increased Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can
influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were
conducted with different numbers of samples and experiments to empirically determine the effect of sample
number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between
12 and 404 libraries. For this analysis, CPM normalization followed by each of four inference methods
(PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were
evaluated by both GO and PPPTY.
In both the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between the
natural logarithm of sample size and average AUROC values (Fig. 4). The linear relationships were stronger for the
PCC and SCC methods, which had higher r-squared values, indicating that correlation methods benefit more from
increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be
included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC
values than the MRNET and CLR methods in the PPPTY and GO analyses for most of the individual networks (Fig. 5).
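The relationship described here, average AUROC regressed on the natural logarithm of sample size, can be fitted by ordinary least squares. The Python sketch below shows one way to obtain the slope, intercept, and r-squared value; the function name `fit_log_linear` is an assumption, and the sample sizes and AUROC values are synthetic numbers generated for illustration, not results from this study.

```python
import math


def fit_log_linear(sample_sizes, aurocs):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size)."""
    x = [math.log(n) for n in sample_sizes]
    mean_x = sum(x) / len(x)
    mean_y = sum(aurocs) / len(aurocs)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, aurocs))
    ss_tot = sum((b - mean_y) ** 2 for b in aurocs)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic, noise-free values used only to exercise the function.
sizes = [12, 24, 48, 96, 192, 404]
synthetic_aurocs = [0.55 + 0.02 * math.log(n) for n in sizes]
slope, intercept, r_squared = fit_log_linear(sizes, synthetic_aurocs)
```

A higher r-squared for one inference method than another, as reported for PCC and SCC, indicates that its performance tracks sample size more closely.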

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also change the outcome of GCN analysis by buffering the
effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated
rank-standardized correlation/MI matrices were calculated from separate experiments to determine whether this approach
enhanced GCN performance. Aggregating individual networks for meta-analysis can help to highlight
true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b).
This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET, and CLR
methods for GCN inference, as before, resulting in 68 single GCNs. The 17 experiments were
aggregated for PCC, SCC, MRNET, and CLR individually and evaluated with the GO and PPPTY datasets.
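A minimal Python sketch of ranked aggregation is given below. It assumes that aggregation means averaging the rank-standardized weight of each edge across experiments and then re-ranking the averages; the text here does not spell out the exact rule, so this choice, the function names, and the toy weights are all assumptions.

```python
def rank_standardize(edges):
    """Map edge weights to (0, 1] by rank; tied weights share an average rank."""
    ordered = sorted(edges, key=lambda e: edges[e])
    standardized = {}
    start = 0
    while start < len(ordered):
        end = start
        while end + 1 < len(ordered) and edges[ordered[end + 1]] == edges[ordered[start]]:
            end += 1
        shared = ((start + end) / 2 + 1) / len(ordered)
        for edge in ordered[start:end + 1]:
            standardized[edge] = shared
        start = end + 1
    return standardized


def aggregate(networks):
    """Average rank-standardized weights over networks, then re-rank."""
    ranked = [rank_standardize(net) for net in networks]
    common = set(ranked[0]).intersection(*ranked[1:])
    mean_rank = {e: sum(r[e] for r in ranked) / len(ranked) for e in common}
    return rank_standardize(mean_rank)


# Two hypothetical experiments scoring the same three gene pairs.
experiment_1 = {("a", "b"): 0.9, ("a", "c"): 0.2, ("b", "c"): 0.5}
experiment_2 = {("a", "b"): 0.7, ("a", "c"): 0.1, ("b", "c"): 0.6}
aggregated = aggregate([experiment_1, experiment_2])
```

Because only ranks enter the average, experiments with different weight scales contribute equally, which is one way aggregation can buffer sample heterogeneity.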
Of the 4 aggregated networks that were evaluated, the two correlation-based networks (PCC and SCC) had higher
AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this
aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR
networks compared with single networks from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation,
p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao
et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate
improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCNs performed best using a ranked
aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a
robust GCN.

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level
and function of a protein of interest. However, many researchers have found inconsistencies between mRNA
and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al.,
2016). Although relatively less protein expression data is available, this data is amenable to GCN construction
and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression
atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks
were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO
datasets as previously mentioned.
GCNs constructed from protein expression did not exhibit AUROC values superior to those observed for
RNA-Seq-based GCNs using the aggregation strategy (Fig. 6). When evaluated by the GO and PPPTY datasets, the
performance of the protein network was lower than that of the aggregated network as well as the single network from
1266 samples. To confirm this result, a two-way ANOVA with pairwise comparison was computed for the GO
evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A
subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were
significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for
some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 overlapped with our RNA-Seq-based
network and were therefore used for the analysis. To demonstrate that the performance of the protein
network was not biased by the selection of genes, the PCC method was applied to all 17862 genes
to construct a protein network (Supplemental Fig. 7). No improvement was detected for the protein network
derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY evaluation
from a one-sided Wilcoxon rank sum test.

PCC- and SCC-built GCNs Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be
compared based upon several network characteristics, including clustering coefficient, number of
nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007),
number of detected modules, and number of genes in the largest module. Number of nodes is a basic construct in
graph theory depicting the scale of a network. Clustering coefficient and number of modules model how
densely nodes are connected in networks. Heterogeneity measures the variability of node connections.
Centralization indicates how likely it is that some nodes have significantly more connections than average. In this
analysis, each gene corresponds to a node. Based on the extensive evaluation using biological
characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize
networks were selected for comparison of basic network characteristics based on their overall performance:
the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single
network from 1266 total samples (MS). The three networks were constrained to include the top one million
predicted interactions, or edges.
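For readers unfamiliar with heterogeneity and centralization, the Python sketch below computes both from an unweighted edge list. It assumes the commonly used definitions, heterogeneity as the coefficient of variation of node degree (population variance) and centralization as n/(n-2) × (max degree/(n-1) − density); the function names are hypothetical, nodes without edges are ignored, and the formulas should be checked against Dong and Horvath (2007) before reuse.

```python
import math
from collections import Counter


def degrees(edges):
    """Count connections per node from an iterable of (node, node) pairs."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


def heterogeneity(edges):
    """Coefficient of variation of node degree (assumed definition)."""
    k = list(degrees(edges).values())
    mean_k = sum(k) / len(k)
    variance = sum((d - mean_k) ** 2 for d in k) / len(k)
    return math.sqrt(variance) / mean_k


def centralization(edges):
    """n/(n-2) * (max degree/(n-1) - density); requires more than two nodes."""
    k = list(degrees(edges).values())
    n = len(k)
    density = sum(k) / (n * (n - 1))
    return n / (n - 2) * (max(k) / (n - 1) - density)


# A star (one hub, four leaves) versus a fully connected four-node network.
star = [("hub", leaf) for leaf in ("a", "b", "c", "d")]
clique = [(u, v) for i, u in enumerate("wxyz") for v in "wxyz"[i + 1:]]
star_values = (heterogeneity(star), centralization(star))
clique_values = (heterogeneity(clique), centralization(clique))
```

The star network scores high on both measures and the fully connected network scores zero, which matches the interpretation that a dominant hub raises both centralization and heterogeneity.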
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution
(Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks
constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and
the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS
network had the highest network centralization value. The network heterogeneity value of MS was over two
times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table
S4), consistent with the observed highest centralization value for this network. Centralization and
heterogeneity are two measures that model the degree distribution of networks. A scale-free network with a larger
number of hubs has larger values of centralization and heterogeneity, whereas a network with larger values of
centralization and heterogeneity may either contain a larger number of hubs or have a number of hubs that is not
especially large but a degree distribution that is extremely imbalanced. In biological networks, many observations
have connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath
and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the
possibility that the high values resulted from an extremely imbalanced degree distribution. For the MS network,
most highly connected genes interacted with a large number of lowly connected genes; this pattern is also
reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental
Fig. 8). The genes with the most interactions are expected to act as key components in GCNs
(Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein
biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were
analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and
processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig.
7A). A total of 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong
candidates for central biological regulators. The annotation of these genes suggests their participation in a
range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation, and gene
silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was
expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic
regulators were found among the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was
used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern
between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger modules than the PA and SA networks. Most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization
value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with
related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA
networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, so these
methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged,
and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA
and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions
(Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall
biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php.
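The merging step can be sketched in Python as a top-k selection per network followed by a set intersection. The helper names (`edge`, `top_edges`, `merge_networks`), the use of unordered gene pairs as edge keys, and the four-edge toy networks are assumptions for illustration; the published networks used the top one million edges.

```python
import heapq


def edge(gene_a, gene_b):
    """Unordered gene pair, so (a, b) and (b, a) are the same edge."""
    return frozenset((gene_a, gene_b))


def top_edges(weights, k):
    """Return the k highest-weighted edges of one network as a set."""
    return set(heapq.nlargest(k, weights, key=weights.get))


def merge_networks(networks, k):
    """Edges that appear among the top-k edges of every network."""
    return set.intersection(*(top_edges(w, k) for w in networks))


# Hypothetical weights for three small networks over the same gene pairs.
pa = {edge("g1", "g2"): 0.99, edge("g1", "g3"): 0.95, edge("g2", "g3"): 0.40, edge("g3", "g4"): 0.10}
sa = {edge("g1", "g2"): 0.98, edge("g1", "g3"): 0.50, edge("g2", "g3"): 0.90, edge("g3", "g4"): 0.20}
ms = {edge("g1", "g2"): 0.97, edge("g1", "g3"): 0.30, edge("g2", "g3"): 0.20, edge("g3", "g4"): 0.96}
merged = merge_networks([pa, sa, ms], k=2)
```

Only the gene pair ranked in the top two by all three toy networks survives, mirroring how the intersection keeps interactions supported by every inference strategy.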

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged
network was compared to existing knowledge of this pathway. Sixteen well-characterized components of
cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose
synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase
gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted
from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in
the network that met the analysis criteria. As expected, cell wall-related GO terms were
enriched among these 214 genes (Fig. 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR10 protein database using BLASTP to
retrieve homologs and their annotations. The literature was manually searched using the maize
genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature
survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% (7/214) that were involved in cell wall-related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817
interactions with 76 genes; 48% (36/75) of the co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize any co-expressed
genes in our network is provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq
data provides a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is on a library basis, which means genes within the same library are normalized by similar factors.
So when the network is constructed by PCC or BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. The two rank correlations, SCC
and KCC, consider only differences in relative rankings, on which normalization has a limited effect. The same holds for
the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in
parmigene (Sales and Romualdi, 2011). Since the three normalization methods produced similar expression
distributions (Supplemental Fig. 2), MI estimates from different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods such as PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study in human GCN analysis (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed a more robust performance than any single inference method in in silico and E. coli
expression networks, referring to this as "the wisdom of crowds". However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an
RNA-Seq-based maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plant Release 31 491
(httpplantsensemblorg) The original 1303 RNA-Seq samples based on illumina HiSeq2000 or Hiseq2500 492
were downloaded from NCBI Sequence Read Archive (SRA) (Leinonen et al 2010) The downloaded files 493
were converted to fastq format using the fastq-dump command in SRA Toolkit (version 252) The adapters for 494
the fastq files were trimmed by Cutadapt 181 (Martin 2011) The adapter-removed files were then quality 495
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 15
checked by FastQC v0112 (httpwwwbioinformaticsbabrahamacukprojectsfastqc) HISAT2 v204 (Kim 496
et al 2015) was used for genome alignment Gene-level expression raw read counts were calculated by 497
FeatureCounts 150 (Liao et al 2014) from aligned bam files (Supplemental Fig S1) 26 libraries with less 498
than 5 million reads total and 11 libraries with less than 70 of total alignment rate were excluded leaving 499
1266 samples (Supplemental Table S1) for the final expression table The processing protocol were 500
streamlined by Snakemake v371 (Koumlster and Rahmann 2012) 501
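The library exclusion rule described above can be sketched as follows. This is only an illustration: the record layout (`id`, `total_reads`, `alignment_rate`) and the three example libraries are hypothetical, and only the two thresholds (5 million reads, 70% overall alignment rate) come from the text.

```python
# Minimal sketch of the library exclusion rule.
# Field names and example values are hypothetical; only the thresholds come from the text.
MIN_TOTAL_READS = 5_000_000   # libraries with fewer total reads are excluded
MIN_ALIGNMENT_RATE = 0.70     # libraries with a lower overall alignment rate are excluded


def keep_library(total_reads, alignment_rate):
    """Return True when a library passes both thresholds."""
    return total_reads >= MIN_TOTAL_READS and alignment_rate >= MIN_ALIGNMENT_RATE


def filter_libraries(libraries):
    """Keep only the library records that pass the read-depth and alignment filters."""
    return [lib for lib in libraries
            if keep_library(lib["total_reads"], lib["alignment_rate"])]


example = [
    {"id": "lib_A", "total_reads": 12_000_000, "alignment_rate": 0.91},
    {"id": "lib_B", "total_reads": 3_500_000, "alignment_rate": 0.88},   # too few reads
    {"id": "lib_C", "total_reads": 20_000_000, "alignment_rate": 0.55},  # low alignment rate
]
kept_ids = [lib["id"] for lib in filter_libraries(example)]
```

In this toy input only `lib_A` passes both filters.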

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al.,
2010) in the R environment and then log2 transformed (expression = log2(CPM + 1) or log2(RPKM + 1)). For both
methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. Variance
Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with
expression higher than 2 CPM in more than 1000 samples were included in further analysis (15116 genes).
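The CPM calculation, log2 transformation and gene filter can be illustrated with a small sketch. It is a simplification under stated assumptions: library sizes are used directly, without the TMM scale factors that edgeR applies in the analysis, and the toy counts, library sizes and function names are hypothetical.

```python
import math


def cpm(counts, library_sizes):
    """Counts per million for a genes x samples table (one list per gene).

    Assumption: plain library sizes are used; TMM scaling is not reproduced here.
    """
    return [[c / n * 1e6 for c, n in zip(row, library_sizes)] for row in counts]


def log2_plus_one(matrix):
    """Apply expression = log2(value + 1) to every entry."""
    return [[math.log2(v + 1) for v in row] for row in matrix]


def filter_genes(cpm_matrix, min_cpm=2.0, min_samples=1000):
    """Indices of genes with CPM above min_cpm in more than min_samples samples."""
    return [i for i, row in enumerate(cpm_matrix)
            if sum(v > min_cpm for v in row) > min_samples]


# Toy data: 3 genes x 3 samples (hypothetical values).
counts = [[10, 20, 30], [0, 1, 0], [500, 400, 600]]
library_sizes = [1_000_000, 2_000_000, 1_000_000]

cpm_matrix = cpm(counts, library_sizes)
kept = filter_genes(cpm_matrix, min_cpm=2.0, min_samples=2)  # the text uses 1000 samples
expression = log2_plus_one([cpm_matrix[i] for i in kept])
```

With the toy threshold of more than 2 samples, the first and third genes are retained and the second is dropped.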

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function.
Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al.,
2009). Gini Correlation Coefficient was calculated with the adjacency.matrix() function in the rsgcc package (Ma and
Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and
Horvath, 2008). Cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt,
2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011).
The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value receiving
rank one. Ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that
all network weights ranged from zero to one.
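The rank standardization of adjacency weights can be sketched as below. The function name and the toy matrix are hypothetical, and the handling of ties (average ranks) is an assumption, since the text does not say how tied weights were ranked.

```python
def rank_normalize(adj):
    """Rank all entries of a square matrix (smallest value = rank 1), divide the
    ranks by the number of entries, and set the diagonal to one.

    Assumption: tied values share their average rank.
    """
    n = len(adj)
    total = n * n
    flat = sorted((adj[i][j], i, j) for i in range(n) for j in range(n))
    out = [[0.0] * n for _ in range(n)]
    pos = 0
    while pos < total:
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < total and flat[end + 1][0] == flat[pos][0]:
            end += 1
        avg_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            _, i, j = flat[k]
            out[i][j] = avg_rank / total
        pos = end + 1
    for i in range(n):
        out[i][i] = 1.0
    return out


# Toy symmetric matrix of raw weights (hypothetical values).
raw = [[1.0, 0.2, 0.8],
       [0.2, 1.0, 0.5],
       [0.8, 0.5, 1.0]]
weights = rank_normalize(raw)
```

Because only ranks are kept, weights produced by different inference methods become comparable on the same zero-to-one scale, which is what makes a later rank-based aggregation possible.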

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in the CPM or VST normalized expression
matrices. Networks were then inferred from the randomized expression matrices using the PCC, MRNET or CLR
methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET
and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
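The randomization step can be illustrated with a brief sketch, assuming the expression matrix is held as a mapping from gene ID to a row of values. The function name, the seed argument and the toy data are hypothetical.

```python
import random


def randomize_expression(expression, seed=None):
    """Reassign gene IDs to expression rows at random.

    expression: dict mapping gene ID -> list of expression values.
    The rows themselves are unchanged; only the labels are shuffled.
    """
    rng = random.Random(seed)
    gene_ids = list(expression)
    shuffled_ids = gene_ids[:]
    rng.shuffle(shuffled_ids)
    return {new_id: expression[old_id]
            for old_id, new_id in zip(gene_ids, shuffled_ids)}


# Toy expression table (hypothetical values).
example = {"g1": [1.0, 2.0], "g2": [3.0, 4.0], "g3": [5.0, 6.0]}
randomized = randomize_expression(example, seed=1)
```

Shuffling labels keeps the distribution of expression values intact while breaking the link between genes and their annotations, so a network inferred from the shuffled table gives a baseline for the evaluation.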
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their
results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et
al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data
for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems
was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of
co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate
AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package
(Ballouz et al., 2016) in R. EGAD uses the “guilt-by-association” principle that genes with shared GO
terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be
evaluated by hiding the GO terms of a subset of genes and testing whether the hidden GO terms can be predicted
from the remaining annotations. Prediction performance was measured by AUROC values in three-fold
cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and
pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to “fdr” (Benjamini and
Hochberg, 1995).
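For readers unfamiliar with AUROC, the quantity can be computed directly from its rank interpretation: the probability that a randomly chosen positive pair receives a higher network weight than a randomly chosen negative pair, with ties counted as one half. The sketch below is a plain illustration of that definition and is not the ROCR or EGAD implementation used in the analysis; the function name and toy data are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: network weights for gene pairs; labels: 1 for known co-expressed pairs, else 0.
    Quadratic in the number of pairs, so suitable only for small illustrations.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: four gene pairs, two of them known positives.
value = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives.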
Definition of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). For the
evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are
co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not; TN, a
network predicts two genes are not co-expressed and they are not co-expressed in PPPTY; FN, a network
predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation
using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our
GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our
GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have it in our GO
dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO
dataset.
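The pair-level definitions for the PPPTY evaluation can be made concrete with a small sketch that counts the four outcomes for a set of predicted gene pairs against a reference set. The gene names, function name and example sets are hypothetical.

```python
from itertools import combinations


def confusion_counts(genes, predicted_pairs, reference_pairs):
    """Count TP, FP, TN and FN over all unordered gene pairs.

    predicted_pairs: pairs the network calls co-expressed.
    reference_pairs: pairs treated as co-expressed in the reference dataset.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    reference = {frozenset(p) for p in reference_pairs}
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for pair in combinations(genes, 2):
        key = frozenset(pair)
        if key in predicted:
            counts["TP" if key in reference else "FP"] += 1
        else:
            counts["FN" if key in reference else "TN"] += 1
    return counts


# Toy example with four genes, i.e. six possible pairs.
counts = confusion_counts(
    genes=["a", "b", "c", "d"],
    predicted_pairs=[("a", "b"), ("a", "c"), ("b", "d")],
    reference_pairs=[("a", "b"), ("c", "d")],
)
```

Sweeping the weight threshold that defines a predicted pair and recomputing these counts at each step traces out the ROC curve whose area is reported as AUROC.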

Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network
topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity
distribution and node degree distribution were plotted with the Network Analyzer plugin (Doncheva et al., 2012).
Graph clustering was performed using the Markov Cluster Algorithm (MCL) in MCL v14-137, with the inflation value set
to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
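Selecting the strongest edges from a weighted network can be sketched as follows. This is an illustration only: the function name and the toy matrix are hypothetical, and the example keeps the top 2 edges where the analysis keeps the top 1 million.

```python
import heapq


def top_edges(weights, genes, k):
    """Return the k highest-weight edges as (weight, gene_i, gene_j) tuples.

    Only the upper triangle is scanned, so each undirected edge appears once
    and self-edges on the diagonal are ignored.
    """
    n = len(genes)
    edges = ((weights[i][j], genes[i], genes[j])
             for i in range(n) for j in range(i + 1, n))
    return heapq.nlargest(k, edges)


# Toy weighted network (hypothetical values).
toy_weights = [[1.0, 0.2, 0.8],
               [0.2, 1.0, 0.5],
               [0.8, 0.5, 1.0]]
strongest = top_edges(toy_weights, ["g1", "g2", "g3"], k=2)
```

Using a heap avoids sorting every edge, which matters when the full network has on the order of 10^8 gene pairs.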

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO’s Singular Enrichment Analysis tool (Du et al., 2010).
The 15116 genes included in our networks were used as the background reference. Hypergeometric testing was used
to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was
used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The
results were then imported into Cytoscape for visualization.
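The enrichment statistics can be sketched in a few lines. This is a minimal illustration under stated assumptions, not AgriGO's implementation: the upper-tail hypergeometric probability is computed exactly from binomial coefficients, the correction follows the Benjamini-Yekutieli step-up procedure (assumed here to be what "Yekutieli method" refers to), and all function names and numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k): probability of seeing at least k annotated genes in a query list
    of size `drawn`, given `annotated` such genes in a background of `population`."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(comb(annotated, i) * comb(population - annotated, drawn - i)
                     for i in range(k, upper + 1))
    return favourable / total


def benjamini_yekutieli(pvalues):
    """Adjusted p-values under the Benjamini-Yekutieli procedure, in input order."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy numbers: 3 of 3 query genes carry a term held by 4 of 10 background genes.
p = hypergeom_upper_tail(3, population=10, annotated=4, drawn=3)
adjusted = benjamini_yekutieli([0.01, 0.04, 0.03])
significant = [q <= 0.05 for q in adjusted]
```

In the analysis the background would be the 15116 network genes; the same function handles that size because Python integers do not overflow.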

Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011;
Supplemental Table S8) were chosen as query genes to search against CORNET Maize
(https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database
using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET
database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05 and Top genes = 50. This
resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff
was set to 0.4 with the maximum number of interactors set to 100. Seventy-six genes with 817 interactions were
retrieved. Maize proteins were searched against TAIR10 protein sequences using standalone BLASTP version 2.2.28+
(Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice
and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department
of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful
discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of genes with the
highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA) and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS) and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. Sixteen query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the
merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the
merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the
CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the
STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year generated by microarray platforms
GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq
Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation
(VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using
VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman
correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for networks constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the
25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate
the highest and lowest AUROC values.

Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes,
respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard
normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by
VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA,
MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC,
Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient;
BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different
sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic
regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC,
Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy
NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC
(black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17 experiments. B,
Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266)
libraries were plotted against sample size. Networks with the same number of samples included are
designated as “_1”, “_2” and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation
coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among single network (white, “1266”), aggregated network (grey, “agg”)
and protein network (dark grey, “pr”) using PCC, SCC, MRNET and CLR. A, GO evaluation on networks.
Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation on networks. Inference methods are indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median, and the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars) and
protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m)
or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B,
AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars) and
protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m)
or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among
the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET
single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges
were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly
interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed
genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and
MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C,
Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files) and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported into the R environment for the
normalization, inference and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue
and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than
10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our
datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype.
MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries
were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the
dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by
Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million
mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line,
CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST
values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the
dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed expression values.
C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of
gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental
stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated
by Pearson correlation coefficient, ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation
comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method was
plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by GO
datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as
calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for
VST, CPM and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal
line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by PPPTY datasets were
plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by
Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC,
Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient;
Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of genes with the
highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET). Average expression in
CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as
solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC
values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted
against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to
1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included
are designated as “_1”, “_2” and “_3”. Outliers were defined as values more than 1.5 times the interquartile range
above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines
are the average AUROC values from the 17 individual networks of each category. Mean values of each network are
labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET,
Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, Area Under the ROC curve (AUROC) values from PPPTY evaluation of protein networks with
17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation
Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate
outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA) and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network. The red
and blue curves show the power-law fitted distributions. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS) and the intersection among the three networks (Merged network). The number of
edges linked to each gene (node degree) was plotted against the number of genes with that degree (number of
nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O’Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D’haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock’s challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R package version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866–12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29–46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25–48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592–1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357–360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520–2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812–1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532–545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43–50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602–4616
Li S Łabaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888–895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664–675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923–930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282–288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177–188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580–585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192–203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423–1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796–804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger‐Wallace E Haug‐Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792–801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet a R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745–64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69–71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987–D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267–1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703–1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424–440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146–2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139–140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876–1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141–D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112–1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337–342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498–2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940–3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443–458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314–362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633–51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895–905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258–264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814–818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300–323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57–63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822–2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629–645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples per year (solid line) for 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
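The outlier definition in this caption corresponds to the usual box-plot convention of flagging values more than 1.5 times the interquartile range beyond the quartiles. As a minimal sketch only, the rule could be written in Python as below. The function name `iqr_outliers`, the choice of linear-interpolation quantiles, and the example values are all assumptions made for illustration; they are not the authors' code or data, and the plotting software used for the figure may compute quartiles slightly differently.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR above Q3 or below Q1.

    Quartiles use linear interpolation over the observed data
    (method="inclusive"); this is an assumption, not the paper's method.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values, for illustration only (not study data).
example_auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
flagged = iqr_outliers(example_auroc)
```

With these made-up values only the single extreme value falls outside the fences; the others stay inside the box-plot whiskers.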
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values colored from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (AUROC) values from GO evaluation of 17 different-sized networks plotted against the natural logarithm of sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against the natural logarithm of sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
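The logarithmic model described in this caption relates average AUROC to the natural logarithm of sample size. The sketch below shows one way such a fit could be computed by ordinary least squares in Python. It is an illustration under stated assumptions, not the authors' analysis, which used lm() in R: the function name `fit_log_model` is hypothetical, the input values are synthetic, and the p-value reported in the figure is not computed here.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic values generated from an exact log model, for illustration only.
sizes = [12, 50, 200, 800, 1266]
values = [0.55 + 0.02 * math.log(n) for n in sizes]
intercept, slope, r2 = fit_log_model(sizes, values)
```

Because the synthetic values follow the model exactly, the fit recovers the generating coefficients; real AUROC averages would scatter around the fitted line and give an R-squared below 1.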
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
[Figure 7 graphic. Panels A and B: Venn diagrams of the MS, PA and SA networks (148 shared genes in A; 106591 shared edges in B). Panel C: GO term enrichment network labeled Hub, with terms including chromatin assembly, chromatin modification, nucleosome organization, gene silencing, epigenetic regulation of gene expression, DNA replication, DNA repair, RNA processing and translation. Panel D: GO term enrichment network labeled Cell Wall, with terms including cell wall biogenesis, cell wall organization, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process, phenylpropanoid metabolic process, root morphogenesis and epidermal cell differentiation.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
stabilizing transformation (VST), Counts Per Million (CPM), and Reads Per Kilobase Million (RPKM) are three
popular normalization methods for RNA-Seq experiments (Mortazavi et al., 2008; Anders and Huber, 2010;
Rau et al., 2013).
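As a rough illustration of the two count-scaling methods named above, the following sketch computes CPM and RPKM for a single library in plain Python. The function names, read counts, and gene lengths are hypothetical and are not taken from the pipeline used in this study. VST is omitted because it depends on a fitted mean-dispersion model rather than a closed-form scaling.

```python
def cpm(counts):
    """Counts Per Million: scale each gene's count by the library size."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase Million: CPM divided by gene length in kilobases."""
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


# Hypothetical library with three genes (raw read counts and lengths in bp).
counts = [100, 300, 600]
lengths = [1000, 2000, 500]

print(cpm(counts))
print(rpkm(counts, lengths))
```

CPM corrects only for sequencing depth, whereas RPKM also corrects for gene length, so the two can rank genes differently within a sample.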
Some work has been done to evaluate the efficacy of different normalization methods for expression analysis.
Giorgi et al. (2013) showed that VST normalization of RNA-Seq data resulted in a GCN with similar characteristics
to a microarray-supported network in terms of coefficient and node degree distribution. Normalization with
CPM, using the Trimmed Mean of M-values (TMM) to adjust for composition bias between RNA-Seq
datasets by calculating normalization factors (Robinson et al., 2010), increased the robustness of analysis
among diverse library sizes and compositions (Dillies et al., 2013a). These studies suggest that optimizing
normalization methods might improve GCN performance.
There are several methods for gene network inference, including correlation, mutual information (MI), Bayesian
network, and probabilistic graphical models. Typically, correlation and MI methods are used for constructing
large-scale GCNs with more than ten thousand genes (Krouk et al., 2013). Correlation methods include
Pearson correlation coefficient (PCC), Spearman's correlation coefficient (SCC), Kendall rank correlation
coefficient (KCC), Gini correlation coefficient (GCC), and biweight midcorrelation (BIC) (Langfelder and
Horvath, 2008; Kumari et al., 2012; Ma and Wang, 2012; Ballouz et al., 2015). Cosine similarity coefficient
(CSC) has also been used for computing similarities in sparse datasets such as text (Dhillon and Modha, 2001)
and protein-protein interaction data (Luo et al., 2015). MI methods include the Algorithm for the Reconstruction
of Accurate Cellular Networks (ARACNE), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR) (Margolin
et al., 2006; Faith et al., 2007; Meyer et al., 2007). The network inference method might also influence GCN
performance.
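To make the difference between some of these measures concrete, the sketch below implements PCC, SCC, and CSC for two expression vectors in plain Python. It is an illustrative sketch only: the function names and example values are assumptions, the remaining correlation measures (KCC, GCC, BIC) and the MI methods are not shown, and production analyses would normally rely on optimized library routines.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation coefficient (SCC): PCC computed on ranks."""
    return pearson(ranks(x), ranks(y))


def cosine(x, y):
    """Cosine similarity coefficient (CSC): no mean-centering is applied."""
    num = sum(a * b for a, b in zip(x, y))
    return num / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


# Hypothetical expression of two genes across five samples; the relationship
# is monotonic but not linear.
gene_a = [1, 2, 3, 4, 5]
gene_b = [1, 4, 9, 16, 25]
print(pearson(gene_a, gene_b), spearman(gene_a, gene_b), cosine(gene_a, gene_b))
```

For this monotonic but nonlinear pair, SCC reaches 1 while PCC stays below 1, which is one reason the choice of measure can change which gene pairs receive the strongest edges.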
Several resources are already available for GCN analysis in maize, including COB (Schaefer et al., 2014),
CORNET (De Bodt et al., 2012), CoP (Ogata et al., 2010), PLANEX (Yim et al., 2013), and ATTED-II (Obayashi
et al., 2009). All of these databases except ATTED-II used PCC to build GCNs from 128 to 379 microarray datasets.
ATTED-II recently updated its database to provide GCNs from both microarray and RNA-Seq data using PCC-based
mutual rank (Aoki et al., 2015). Although PCC is widely used, there is very limited evidence that it is the
optimal approach for GCN analyses.
GCNs could also be improved by meta-analysis using ranked aggregation from individual networks (Zhong et
al., 2014; Ballouz et al., 2015; Wang et al., 2015a). By aggregating individual experiments, only interactions
consistent among networks are preserved, which helps reduce noise and highlights conserved interactions.
Furthermore, the ranked aggregation method provides a way to efficiently increase the size of the aggregated
network with newly available datasets, and recalculation with all datasets is not required when a new one is
added. This provides an efficient way to process and incorporate emerging information.
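The idea of ranked aggregation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact procedure used in this study: each per-experiment network is assumed to be a dictionary of edge weights over the same gene pairs, aggregation is assumed to be a sum of rank-standardized weights followed by re-ranking, and the function names and toy weights are hypothetical.

```python
def rank_standardize(network):
    """Map edge weights to ranks scaled into (0, 1]; ties share the mean rank."""
    edges = sorted(network, key=network.get)
    n = len(edges)
    out = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and network[edges[j + 1]] == network[edges[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[edges[k]] = mean_rank / n
        i = j + 1
    return out

def aggregate(networks):
    """Sum rank-standardized edge weights across networks, then re-rank."""
    ranked = [rank_standardize(net) for net in networks]
    summed = {edge: sum(r[edge] for r in ranked) for edge in ranked[0]}
    return rank_standardize(summed)

# Two toy per-experiment networks over the same three gene pairs (made-up weights).
exp1 = {("g1", "g2"): 0.9, ("g1", "g3"): 0.1, ("g2", "g3"): 0.5}
exp2 = {("g1", "g2"): 0.7, ("g1", "g3"): 0.6, ("g2", "g3"): -0.2}
merged = aggregate([exp1, exp2])
print(merged)
```

Only the g1-g2 edge ranks highest in both experiments, so it receives the top aggregated weight, while the two edges that disagree between experiments tie at a lower value. Under this assumed scheme, adding a new experiment only requires ranking the new network and adding it to the stored sum.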
Herein, an extensive evaluation of maize GCN construction is reported. Three parameters were tested:
normalization method, network inference algorithm, and ranked aggregation method. To our knowledge, this is
the first comprehensive attempt to optimize GCN construction using plant RNA-Seq datasets. The network is
publicly accessible at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php. A tutorial is also provided as
supplemental material.

Results
Manually Curated Maize mRNA Expression Profiling from Publicly Available Datasets
Recently, the usage of RNA-Seq in maize has increased dramatically, from zero data entries in NCBI-SRA
in 2008 to over nine hundred in 2016 (Fig. 1B). In contrast, the most widely used Affymetrix
expression array for maize had 177 samples in 2008 but only 46 in 2016 (Fig. 1B). GCN construction
approaches have not been optimized for RNA-Seq datasets in plants, and doing so could improve the quality
and robustness of GCNs. To support a comprehensive evaluation of the effect of RNA-Seq normalization
methods and network inference methods on the performance of GCNs, maize RNA-Seq datasets were
compiled and processed with a computational pipeline (Supplemental Fig. 1). A total of 1266 high-quality
maize RNA-Seq libraries from 17 different experiments were selected as input to an expression matrix. The
corresponding experimental descriptions and publications, where available, of each library were manually
checked for sample information (Supplemental Table S1). In addition, filters for read depth and alignment rate
were used to remove unqualified libraries (see Methods for details). Tissue type and haplotype from those
libraries were manually curated and found to include a range of sample types (Supplemental Table S1). Shoot
apical meristem (SAM), leaf, and root were the three most abundant tissue types, but a wide range of
tissues were represented by multiple libraries in the dataset (Supplemental Fig. 1). The dataset also included
multiple haplotypes, although B73 represented approximately 40% of the included libraries. To reduce noise,
lowly expressed genes were removed from analysis, leaving 15116 nonredundant genes across the 1266
libraries. For comparative purposes, the Affymetrix GeneChip maize array includes 13339 genes before
filtering (GeneChip Maize Genome Array,
http://www.affymetrix.com/catalog/131468/AFFY/Maize+Genome+Array#1_1).

Three RNA-Seq Normalization Methods Show Comparable Distribution of Expression
Expression data from distinct sources and experiments can be highly variable because of hybridization artifacts
in microarrays or variable sequencing depth in RNA-Seq. Many methods have been successfully used for
normalizing both microarray and RNA-Seq data to correct for potential biases (Lim et al., 2007; Dillies et al.,
2013b; Li et al., 2015b). To find an optimal normalization method for building a maize GCN from RNA-Seq data,
three widely used normalization methods were compared: Variance Stabilizing Transformation
(VST), Counts Per Million (CPM), and Reads Per Kilobase Per Million (RPKM) (Mortazavi et al., 2008; Anders
and Huber, 2010; Rau et al., 2013). For all normalization methods, log2 transformation of the normalized
expression values reduced the skew of the data distribution (Supplemental Fig. 2). Several network studies
of plant RNA-Seq data used log2 transformation (Davidson et al., 2011; Ma and Wang, 2012; Giorgi et al.,
2013; Stelpflug et al., 2015; Walley et al., 2016). In our analysis, genes with CPM > 2 in more than 1000
samples were included. This filter dramatically reduced zero count values in the raw data from 30.949% to 0.367%.
Moreover, a prior count of one was added at log2 transformation (expression = log2(CPM/RPKM + 1)) to avoid
problems with remaining zero values. The log2 transformation reduced skewed distributions and extreme values
represented by outliers (Supplemental Fig. 2). Thus, we consider it important to apply log2 transformation to our
data.
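The CPM step, the expression filter, and the log2(x + 1) transformation described above can be sketched as follows. This is a simplified illustration rather than the pipeline used in this study: the counts are made up, the function names are hypothetical, no TMM scaling is applied, and the filter threshold on the number of libraries is reduced to fit a three-library toy matrix (the analysis itself required CPM > 2 in more than 1000 of 1266 samples).

```python
import math

def cpm(counts):
    """counts: {gene: [raw count per library]} -> {gene: [CPM per library]}."""
    n_lib = len(next(iter(counts.values())))
    lib_sizes = [sum(c[j] for c in counts.values()) for j in range(n_lib)]
    return {g: [c[j] * 1e6 / lib_sizes[j] for j in range(n_lib)]
            for g, c in counts.items()}

def filter_expressed(cpm_values, min_cpm, min_libraries):
    """Keep genes with CPM > min_cpm in more than min_libraries libraries."""
    return {g: v for g, v in cpm_values.items()
            if sum(x > min_cpm for x in v) > min_libraries}

def log2_plus_one(values):
    """Apply log2(x + 1), the prior count of one described in the text."""
    return {g: [math.log2(x + 1) for x in v] for g, v in values.items()}

# Toy raw counts for three genes in three libraries (illustrative values only).
raw = {
    "geneA": [500000, 400000, 600000],
    "geneB": [499999, 599999, 399998],
    "geneC": [1, 1, 2],
}
expr = log2_plus_one(filter_expressed(cpm(raw), min_cpm=2, min_libraries=2))
print(sorted(expr))
```

In this toy matrix geneC never exceeds 2 CPM, so it is dropped before the log2 transformation, mirroring how the filter removes most zero and near-zero entries.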
The distribution of gene expression across the 1266 libraries formed a bell-shaped curve with a small
additional peak of low expression for all three methods (Supplemental Fig. 2). To determine if these low
expression values came from a few or multiple libraries, elements within the range of expression that
corresponded to the observed peak (< -3.7 CPM; Supplemental Fig. 2B) were extracted from the CPM-normalized
expression matrix and matched to the originating libraries. This demonstrated that the low expression elements
were not limited exclusively to specific libraries, but eight libraries contributed over 25% of low elements. A
gene ontology enrichment analysis failed to identify significant gene ontology descriptors within the subset of
43 genes that were defined as lowly expressed (data not shown). All eight of these libraries were from pollen
tissue, where the average gene expression at 147 Counts Per Million (CPM) is lower than the average gene
expression of the other 79 tissues combined, at an average of 183 CPM. Hierarchical clustering and a correlation
heatmap with the same data (Stelpflug et al., 2015) show the uniqueness of the pollen tissue expression pattern
(Langfelder and Horvath, 2008) (Supplemental Fig. 3). When the lowly expressed elements from RPKM- and
VST-normalized data were analyzed to determine library origin and GO enrichment (data not shown), we found
a similarly high level of pollen-specific libraries without significant GO categories. In pollen, some highly
expressed genes are considered orphan genes (Wu et al., 2014) because they lack detectable homologs in
other species. To investigate whether these lowly expressed genes were orphan genes, their gene
sequences were searched against the Setaria italica genome (JGIv2) (BLASTX, e-value < 1E-03). Setaria italica
(foxtail millet) is a close relative of maize, from which it diverged 23.4 million years ago (MYA) as estimated by
TimeTree (Kumar et al., 2017). Only 1 out of 43 genes lacked detectable homologs in Setaria italica (data not
shown), indicating that the majority of these genes are not likely to be orphan genes.
Because RPKM normalization accounts for gene length, the distribution of gene length versus expression for
the RPKM method was compared to data normalized by the VST and CPM methods. VST- and CPM-normalized
data showed very similar overall patterns, with no clear linear relationship between gene length and average
expression (Supplemental Fig. 2C). RPKM-normalized data displayed an apparent bias toward elevated
expression of a small number of genes less than 5000 bp in length and lower expression of long genes,
suggesting that this normalization method might skew the distribution of expression at some genes. Overall, in
spite of these differences, the three normalization methods resulted in a similar distribution of expression
patterns for most of the genes included in the analysis. Additional analysis was completed to determine if the
three normalization methods influence network performance.

Network Performance Does Not Differ Based Upon Normalization Method
To compare the efficacy of three normalization and ten inference methods, a GCN was generated for each
combination of normalization and inference methods. Furthermore, all networks were rank-standardized so that
edge weights ranged from 0 to 1 (see Methods). All network evaluations used the whole adjacency matrix
(15116 × 15116 in RNA-Seq networks; 11429 × 11429 or 17862 × 17862 in protein networks) without a cut-off.
The performance of the different networks was measured by comparing the area under the receiver operating
characteristic curve (AUROC). AUROC is a measurement used to evaluate the accuracy of classification
models, making it suitable for evaluating GCNs (Gillis and Pavlidis, 2011; Ma and Wang, 2012; Liu et al., 2017).
AUROC values range from 0 to 1, with a value closer to 1 indicating that the network is discriminating
nonrandom patterns and approaching perfect classification, random networks returning values close to 0.5, and
values closer to 0 indicating a high degree of incorrect classification. While an AUROC value close to 1 is optimal,
values over 0.7 suggest good performance when analyzing large, diverse networks (Gillis and Pavlidis, 2011).
To set up the AUROC baseline for random networks, maize gene IDs in the normalized expression matrix were
shuffled 10 times (for MRNET and CLR) or 1000 times (for PCC). Networks were inferred from the randomized
expression matrices using the designated algorithms and then evaluated. The resulting AUROC values from
randomized networks were very close to 0.5 (Supplemental Table S2).
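To make the interpretation of AUROC values concrete, the sketch below computes AUROC from scores and binary labels using the rank-sum (Mann-Whitney) identity. It is a generic illustration with hypothetical function names and toy inputs, not the evaluation code used for these networks.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def auroc(scores, labels):
    """AUROC from the rank sum of the positives (labels are 1 or 0)."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, is_pos in zip(ranks, labels) if is_pos)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

labels = [1, 1, 0, 0]
perfect = auroc([0.9, 0.8, 0.3, 0.1], labels)        # positives ranked on top
inverted = auroc([0.1, 0.3, 0.8, 0.9], labels)       # positives ranked last
uninformative = auroc([0.5, 0.5, 0.5, 0.5], labels)  # no discrimination
print(perfect, inverted, uninformative)
```

The three toy cases correspond to the three regimes described above: values near 1 for correct ranking, near 0 for systematically incorrect ranking, and 0.5 when the scores carry no information.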
AUROC values were calculated and compared for three different network characteristics. The first
characteristic was designed to test if the network identified genes with known or predicted co-expression
patterns, based upon prior results and inclusion in two existing datasets that could serve as a positive control
for co-expression. The maize metabolic pathway database (MaizeCyc) contains 413 pathways with more than two
genes and was built based upon a collection of evidence from genome annotation, phylogenetic distance, and
known genes in maize, rice, and Arabidopsis (Monaco et al., 2013). The maize protein-protein interaction database
(PPIM) is based upon both predicted and experimentally detected protein interactions (Zhu et al., 2016) and
was the second dataset used in this analysis. Only high-confidence interactions from PPIM were used, as
defined by ranking in the top 5% in their model (Zhu et al., 2016). For comparison with the GCN, genes within the
same MaizeCyc or PPIM pathways were considered co-expressed. The MaizeCyc and PPIM datasets were
combined, and genes with fewer than 5 interactions were excluded from evaluation, creating a compiled dataset
referred to herein as the Protein-Protein and Pathway dataset (PPPTY). PPPTY had 1720 genes and 104856
interactions that were used in this evaluation. The AUROC value was calculated for each of the 1720 gene
terms.
To assess the effect of normalization method on GCNs, AUROC values for all ten inference methods were
averaged for each of the three normalization methods. All three normalization methods scored similarly in
comparison with the PPPTY dataset (Fig. 2B), with a mean AUROC value around 0.575 for each, suggesting
that the predicted networks were more selective than a random network.
The second characteristic was the presence of similar gene ontology (GO) information for maize genes within
a detected co-expression set, based upon "guilt by association", which assumes that specific subgroups of
co-expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from
AgriGO (Du et al., 2010), which uses signature integration by InterPro, rather than co-expression data, to map
gene IDs to GO terms. InterPro provided over 108 million stable GO terms to the functional protein
information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations
provide a reliable evaluation resource independent of co-expression data. To assess this characteristic, gene
ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for sets of
co-expression matrices and compared. Co-expression matrices were assessed by 3-fold cross-validation, which
involved masking GO terms from some genes to test whether the masked GO terms could be predicted based
upon gene expression patterns. A total of 277 GO terms were included in this analysis.
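The masking procedure can be illustrated with a simplified neighbor voting sketch. It is written under several assumptions and is not the implementation of Gillis and Pavlidis (2011): the network is a small dense adjacency matrix with made-up weights, each gene is scored by the fraction of its total edge weight that connects to genes still annotated with the term, folds are assigned by simple striding, and AUROC is computed by direct pairwise comparison.

```python
def auroc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def neighbor_voting(adj, term_genes, n_folds=3):
    """Mean cross-validated AUROC for recovering masked members of one term."""
    n = len(adj)
    positives = sorted(term_genes)
    fold_scores = []
    for fold in range(n_folds):
        masked = set(positives[fold::n_folds])   # annotations hidden this fold
        training = set(positives) - masked       # annotations still visible
        candidates = [g for g in range(n) if g not in training]
        scores = []
        for g in candidates:
            degree = sum(adj[g][h] for h in range(n) if h != g)
            votes = sum(adj[g][h] for h in training)
            scores.append(votes / degree if degree else 0.0)
        labels = [1 if g in masked else 0 for g in candidates]
        fold_scores.append(auroc(scores, labels))
    return sum(fold_scores) / n_folds

# Toy network: genes 0-2 and genes 3-5 form two tightly co-expressed groups.
n = 6
adj = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(n):
        if i != j:
            adj[i][j] = 0.9 if (i < 3) == (j < 3) else 0.1

score = neighbor_voting(adj, term_genes={0, 1, 2})
print(score)
```

In this toy case the term maps exactly onto one co-expression group, so every masked gene is recovered ahead of all unannotated genes and the cross-validated AUROC is 1.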
When GO characteristics were used to assess the networks, all three normalization methods performed
similarly, but the AUROC values were higher, at around 0.689 for each, than those observed for comparisons
with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein
interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions.
The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets were
0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant difference
in the AUROC scores associated with the GCNs for the characteristics that were tested.
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a
subset of co-expressed genes. The interactions between such a protein and regulated DNA could be detected
by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, there are
five ChIP-Seq datasets available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et al., 2015;
Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example, Opaque2 is
specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral tissues (Bolduc
et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed (Morohashi et al.,
2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for comparison, with 26
confirmed binding targets that are relatively highly expressed in most maize tissues (Yang et al., 2016). Histone
deacetylation often correlates with decreased gene expression (Verdin and Ott, 2014). High-confidence
HDA101 targets were defined as those discovered by ChIP-Seq that also showed increased gene
expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101 targets were
compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar among
networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY evaluations.

Correlation Methods Perform Better than Mutual Information at Some Genes
After normalization, the expression matrices can be processed by different methods for GCN inference.
To optimize this step, the AUROC values of six correlation (PCC, SCC, KCC, GCC, BIC, CSC) and four mutual
information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices that were
generated from each of three normalization methods (VST, CPM, RPKM) and then averaged. In general,
correlation methods are more computationally efficient, while MI methods are able to reveal non-linear
relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC,
KCC, and BIC are less sensitive to outliers because SCC and KCC only consider the rank information and BIC
calculates based on the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to
be a better correlation method for gene expression analysis because of its capacity to detect non-linear
relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and
analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR showed
extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al.,
2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same
AUROC-based tests were performed as described for the testing of normalization
methods.
Assessed by GO datasets, the 277 AUROC values were averaged to create one average value for each of the
10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization
methods for the six correlation methods was 0.718, while the average AUROC for all four MI methods was
0.646. The majority of the 277 GO terms had similar AUROC values in the different correlation
method-generated GCNs, and these patterns are different from those observed in the MI-generated GCNs (Fig. 3A).
The similarity among different methods was also detectable by pairwise comparison and by comparing Pearson
correlations between the different methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were
averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed
that the networks constructed using correlation methods resulted in higher AUROC values than MI methods,
although the CSC method resulted in lower AUROC values than other correlation methods. As demonstrated
for the GO evaluation, results from correlation methods were more similar to each other than to the MI methods
(Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher
AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B), although this includes a small enough
number of genes that the average AUROC value over the whole gene set was relatively low for those methods.
The gene sets with the highest AUROC values in PCC, CSC, or MRNET were extracted. Characteristics of each
gene set were compared in terms of average expression (CPM) and average number of lowly expressed elements
(CPM < 0). The CSC gene set had the smallest number of low expression elements and had higher average
expression than both the 1720 gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that
the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from 26 targets of the HDA101 ChIP-Seq dataset reveal that the CSC GCN had the highest
AUROC value and that MRNET/CLR GCNs resulted in slightly higher scores than correlation methods
(Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also
indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes
(Tzfadia et al., 2016). HDA101 is a highly expressed gene in all samples, with an average expression value of
864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is better suited to the
CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using
two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5%
non-zero values for all comparisons, and so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize
GCN from a single expression matrix, but co-expression with some individual genes may be better detected
using MI methods. Normalization method did not have a substantial influence on GCN performance, so only
CPM normalization was used in conjunction with PCC, SCC, MRNET, and CLR inference for subsequent
optimization of other parameters.

Increasing Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can
influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were
conducted with different numbers of samples and experiments to empirically determine the effect of sample
number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between
12 and 404 libraries. For this analysis, the CPM normalization method followed by each of four inference methods
(PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were
evaluated by both GO and PPPTY.
From GO and PPPTY evaluation, all algorithms exhibited a positive linear relationship between
natural-log-transformed sample size and average AUROC values (Fig. 4). The linear relationships were stronger for
the PCC and SCC methods, which had higher r-squared values, indicating that correlation methods benefit more from
increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be
included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC
values than the MRNET and CLR methods in PPPTY and GO analysis for most of the individual networks (Fig. 5).
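The relationship described above amounts to an ordinary least-squares fit of average AUROC against the natural logarithm of the number of libraries, summarized by its r-squared value. The sketch below shows that calculation; the library counts and AUROC values are invented for illustration (only the 12 to 404 range comes from the text), and linear_fit is a hypothetical helper.

```python
import math

def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, plus r-squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical per-experiment values, not results from this study.
library_counts = [12, 30, 75, 180, 404]
mean_aurocs = [0.56, 0.60, 0.61, 0.65, 0.66]

log_sizes = [math.log(n) for n in library_counts]
slope, intercept, r_squared = linear_fit(log_sizes, mean_aurocs)
print(f"slope = {slope:.4f}, r-squared = {r_squared:.3f}")
```

A positive slope with a high r-squared indicates that AUROC rises roughly in proportion to each multiplicative increase in sample size, which is the pattern reported for the correlation methods.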

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also be applied to change GCN outcomes by buffering the
effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated
rank-standardized correlation/MI matrices were calculated from separate experiments to determine if this approach
enhanced GCN performance. Aggregating individual networks together for meta-analysis can help to highlight
true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b).
This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET, and CLR
methods for GCN inference, as before, resulting in 68 single GCNs. The 17 experiments were
aggregated for PCC, SCC, MRNET, and CLR individually and evaluated by the GO and PPPTY datasets.
Of the 4 aggregated networks that were evaluated, the two correlation methods (PCC and SCC) had higher
AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this
aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR method
networks compared with single networks with 1266 samples (two-tailed Wilcoxon rank test for GO evaluation,
p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao
et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate
improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCN performed best using a ranked
aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a
robust GCN.

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level
and function of a protein of interest. However, many researchers have found inconsistencies between mRNA
and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al.,
2016). Although relatively little protein expression data is available, this data is amenable to GCN construction
and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression
atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks
were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO
datasets as previously mentioned.
GCNs constructed from protein expression did not exhibit superior AUROC values to those observed for
RNA-Seq-based GCNs using the aggregation strategy (Fig. 6). When evaluated by the GO and PPPTY datasets, the
performance of the protein network was lower than that of the aggregated network as well as the single network
from 1266 samples. To confirm this result, a two-way ANOVA was computed with pairwise comparison for the GO
evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A
subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were
significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for
some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 genes overlapped with our
RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the
protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes
to construct a protein network (Supplemental Fig. 7). No improvement could be detected for the protein network
derived from 17862 genes, with p-values of 0.635 for GO evaluation and 0.995 for PPPTY evaluation
from a one-sided Wilcoxon rank sum test.

PCC- and SCC-built GCNs Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be
compared based upon several different network characteristics, including clustering coefficient, number of
nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007),
number of detected modules, and number of genes in the largest module. Number of nodes is a basic construct in
graph theory depicting the scale of a network. Clustering coefficient and number of modules describe how
densely nodes are connected in networks. Heterogeneity measures the variability of node connections.
Centralization indicates how likely some nodes are to have significantly more connections than average. In this
analysis, each gene corresponds to a node. Based on the extensive evaluation using biological
characteristics like protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize
networks were selected for comparison of basic network characteristics based on their overall performance:
the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single
network from 1266 total samples (MS). The three networks were constrained to include the top one million
predicted interactions, or edges.
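Two of the characteristics named above can be computed directly from node degrees. The sketch below assumes the commonly cited forms attributed to Dong and Horvath (2007): heterogeneity as the coefficient of variation of the degree distribution, and centralization as n/(n - 2) times the difference between the scaled maximum degree and the network density. These formulas should be checked against the original reference, and the toy edge lists and function names are hypothetical.

```python
import math

def degrees(edges):
    """Node degree for an undirected, unweighted edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def heterogeneity(deg):
    """Assumed form: standard deviation of degree divided by mean degree."""
    k = list(deg.values())
    n = len(k)
    return math.sqrt(max(0.0, n * sum(x * x for x in k) / sum(k) ** 2 - 1))

def centralization(deg):
    """Assumed form: n / (n - 2) * (max degree / (n - 1) - density)."""
    k = list(deg.values())
    n = len(k)
    density = sum(k) / (n * (n - 1))
    return n / (n - 2) * (max(k) / (n - 1) - density)

# Toy networks: a star (one hub) versus a ring (all nodes equal).
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]

for name, edges in (("star", star), ("ring", ring)):
    deg = degrees(edges)
    print(name, round(heterogeneity(deg), 3), round(centralization(deg), 3))
```

The star network, dominated by a single hub, reaches the maximum centralization of 1 and has nonzero heterogeneity, whereas the ring scores 0 on both, which matches the intuition that these measures summarize how uneven the degree distribution is.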
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution
(Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks
constructed using optimized parameters, both neighborhood connectivity distribution (Supplemental Fig. 8) and
node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS
network had the highest network centralization value. The network heterogeneity value of MS was over two
times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table
S4), consistent with the observed highest centralization value for this network. Centralization and
heterogeneity are two measures that describe the degree distribution of networks. A scale-free network with more
hubs has larger values of centralization and heterogeneity, whereas a network with larger values of
centralization and heterogeneity may either contain a larger number of hubs or have a modest number of hubs
but an extremely imbalanced degree distribution. In biological networks, many observations have
connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath
and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the
possibility that high values resulted from an extremely imbalanced degree distribution. For the MS network,
most highly connected genes interacted with a large number of lowly connected genes; this pattern is also
reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental
Fig. 8). The genes with the most interactions are expected to act as key components in GCNs
(Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein
biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were
analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and
processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig.
7A). A total of 148 genes were shared among all three networks (Supplemental Table S5), making these genes
strong candidates for central biological regulators. The annotation of these genes suggests their participation in a
range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation, and gene
silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of top interacting genes (27/148), which was
expected because of their cellular abundance and involvement with translation. Interestingly, nine epigenetic
regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
To reveal the underlying properties of GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was
used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern
between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger modules detected than the PA and SA networks. Consequently, most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network
centralization value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct
modules with related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA
and SA networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so these
methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were compared, and the intersection of the three produced a merged network of 14,277 genes and 106,591 interactions. PA and SA shared 83.5% of their interactions, while 87.3% of the MS interactions were unique (Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study of cell wall biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php.
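The merge step can be sketched in Python as a set intersection over each network's top-ranked edges. This is a minimal illustration and not the authors' code: the function names `top_edges` and `merge_networks`, the dictionary input format, and the toy weights are all assumptions, and each gene pair is assumed to appear only once per network.

```python
def top_edges(weights, n):
    """Return the n highest-weighted edges as a set of unordered gene pairs.

    `weights` maps (gene_a, gene_b) tuples to edge weights; each pair is
    assumed to be listed once (hypothetical input format).
    """
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:n]}


def merge_networks(networks, n):
    """Intersect the top-n edge sets of several networks.

    Returns the shared edges and the set of genes those edges connect.
    """
    edge_sets = [top_edges(weights, n) for weights in networks]
    shared = set.intersection(*edge_sets)
    genes = set().union(*shared)
    return shared, genes


# Toy example with three tiny networks (weights are made up).
pa = {("g1", "g2"): 0.9, ("g2", "g3"): 0.8, ("g3", "g4"): 0.1}
sa = {("g2", "g1"): 0.95, ("g2", "g3"): 0.7, ("g1", "g4"): 0.2}
ms = {("g1", "g2"): 0.6, ("g3", "g4"): 0.5, ("g2", "g3"): 0.4}
shared_edges, shared_genes = merge_networks([pa, sa, ms], n=2)
```

In the setting described here, n would be 1,000,000 and the three inputs would be the PA, SA, and MS networks.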

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged network was compared with existing knowledge of this pathway. Sixteen well-characterized components of cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in the network that met the analysis criteria. As expected, cell wall-related GO terms were enriched among these 214 genes (Fig. 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize genes and their Arabidopsis homologs as queries (Supplemental Table S10). The literature survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall-related pathways. This suggests that the network discriminated co-expressed genes and identified some known components of the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis, providing rigidity and strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT, CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11, IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3 (Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same 16 genes and similar parameters (see Methods). In CORNET, 10 of the 16 genes had co-expressed genes (Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very well, with 14 of the 16 genes showing predicted protein associations (Fig. 8C), resulting in 817 interactions among 76 genes; 48% (36/75) of the co-expressed genes were experimentally confirmed (Supplemental Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes (PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more power to find genes that function together without physically interacting. This case study shows that a robust, optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network is provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, its use is quickly increasing. With over five thousand libraries available for maize, there are now ample data to support GCN analysis. This comprehensive evaluation of normalization and network inference methods using real maize RNA-Seq data provides a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis, consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided (Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed in the maize GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity among samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, normalization is performed on a per-library basis, which means genes within the same library are normalized by similar factors. So when the network is constructed by PCC or BIC, where expression vectors are centered by mean or median values, the effect of different normalization methods is probably small. The two rank correlations, SCC and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same holds for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods produced similar expression distributions (Supplemental Fig. 2), MI estimates from the different normalizations are expected to be similar.
When assessing inference methods, simple and widely used correlation methods like PCC and SCC are less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall performance. This is consistent with a study of human GCNs (Ballouz et al., 2015), although SCC did not score higher than other correlation methods in the GO and PPPTY evaluations. Some genes had higher performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008; Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference methods showed more robust performance than any single inference method in in silico and E. coli expression networks, referring to this as "the wisdom of crowds". However, for the available maize data, integrating PCC, SCC, MRNET, and CLR did not result in a network that outperformed the PCC and SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization and inference methods for building an RNA-Seq-based maize GCN. This optimization may apply to a range of datasets that share characteristics with maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on the Illumina HiSeq 2000 or HiSeq 2500, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files were converted to fastq format using the fastq-dump command in the SRA Toolkit (version 2.5.2). Adapters were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by featureCounts 1.5.0 (Liao et al., 2014) from the aligned BAM files (Supplemental Fig. S1). Twenty-six libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
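The two exclusion criteria amount to a simple per-library filter. The sketch below illustrates it in Python under stated assumptions: the function name `passes_qc`, the library names, and the read counts are hypothetical, and only the two thresholds come from the text.

```python
def passes_qc(total_reads, alignment_rate,
              min_reads=5_000_000, min_alignment=0.70):
    """Keep a library only if it meets both thresholds from the text."""
    return total_reads >= min_reads and alignment_rate >= min_alignment


# Hypothetical libraries: (total reads, overall alignment rate).
libraries = {
    "lib_a": (12_300_000, 0.91),
    "lib_b": (3_100_000, 0.88),   # excluded: fewer than 5 million reads
    "lib_c": (20_000_000, 0.55),  # excluded: alignment rate below 70%
}
kept = [name for name, (reads, rate) in libraries.items()
        if passes_qc(reads, rate)]
```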

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(CPM + 1) or log2(RPKM + 1)). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. The Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in further analysis (15,116 genes).
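The arithmetic behind CPM, RPKM, the log2 transform, and the expression filter can be illustrated with a short Python sketch. It is not a reimplementation of edgeR: the `norm_factor` argument is only a placeholder for the TMM scaling factor that edgeR would estimate (it is not computed here), and all function names and toy counts are assumptions.

```python
import math


def cpm(counts, lib_size, norm_factor=1.0):
    """Counts per million for one library, using an effective library size."""
    effective = lib_size * norm_factor
    return [c * 1e6 / effective for c in counts]


def rpkm(counts, gene_lengths_bp, lib_size, norm_factor=1.0):
    """Reads per kilobase of gene length per million reads for one library."""
    effective = lib_size * norm_factor
    return [c * 1e9 / (effective * length)
            for c, length in zip(counts, gene_lengths_bp)]


def log2_plus_one(values):
    """Apply the log2(x + 1) transform described in the text."""
    return [math.log2(v + 1) for v in values]


def keep_gene(cpm_across_samples, min_cpm=2.0, min_samples=1000):
    """True if the gene exceeds min_cpm in more than min_samples samples."""
    return sum(v > min_cpm for v in cpm_across_samples) > min_samples


# Toy library with three genes (counts and lengths are made up).
counts = [10, 0, 90]
lengths = [1000, 2000, 500]
library_size = sum(counts)
cpm_values = cpm(counts, library_size)
rpkm_values = rpkm(counts, lengths, library_size)
log_cpm = log2_plus_one(cpm_values)
```

With `norm_factor` left at 1.0 the effective library size is just the raw library size, which is why this is only an approximation of a TMM-scaled calculation.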

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. The Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. The Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). The Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). The cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned rank one. The ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all network weights ranged from zero to one.
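The rank standardization in the last two sentences can be sketched as follows. This Python illustration makes one assumption the text does not state: tied values receive their average rank. The function name `rank_scale` and the toy matrix are also hypothetical.

```python
def rank_scale(matrix):
    """Rank all entries (smallest = 1), divide by the element count,
    and set the diagonal to one. Ties share their average rank
    (an assumption; the source does not specify tie handling)."""
    n = len(matrix)
    total = n * n
    entries = sorted((matrix[i][j], i, j) for i in range(n) for j in range(n))
    scaled = [[0.0] * n for _ in range(n)]
    pos = 0
    while pos < total:
        end = pos
        # Extend the run while the next entry has the same value.
        while end + 1 < total and entries[end + 1][0] == entries[pos][0]:
            end += 1
        average_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            _, i, j = entries[k]
            scaled[i][j] = average_rank / total
        pos = end + 1
    for i in range(n):
        scaled[i][i] = 1.0
    return scaled


# Toy symmetric weight matrix for three genes (values are made up).
weights = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.9],
    [0.5, 0.9, 1.0],
]
scaled_weights = rank_scale(weights)
```

Because only ranks are kept, networks built from correlation coefficients and from mutual information end up on the same zero-to-one scale, which is what makes them comparable in the later evaluation.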

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in the CPM- or VST-normalized expression matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR methods and evaluated. For PCC, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
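Shuffling gene IDs keeps every expression profile intact but breaks the link between a profile and its annotation, which is what produces a random baseline. A minimal Python sketch follows, assuming a hypothetical dictionary representation of the expression matrix; the function name and toy values are illustrative only.

```python
import random


def shuffle_gene_ids(expression, seed=None):
    """Reassign gene IDs to expression profiles at random.

    `expression` maps gene ID -> list of values across samples.
    Profiles are unchanged; only their labels are permuted.
    """
    rng = random.Random(seed)
    gene_ids = list(expression)
    shuffled_ids = gene_ids[:]
    rng.shuffle(shuffled_ids)
    return {new_id: expression[old_id]
            for new_id, old_id in zip(shuffled_ids, gene_ids)}


# Toy expression matrix: three genes, two samples (values are made up).
expression = {"g1": [1.0, 2.0], "g2": [3.0, 4.0], "g3": [5.0, 6.0]}
randomized = shuffle_gene_ids(expression, seed=0)
```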
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq-confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expression examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package (Ballouz et al., 2016) in R. EGAD uses the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding the GO terms of a subset of genes and testing whether the hidden GO terms can be predicted from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
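For readers unfamiliar with AUROC, the value can be computed directly from ranks: it equals the probability that a randomly chosen positive pair receives a higher network weight than a randomly chosen negative pair. The Python sketch below uses this rank-sum formulation with average ranks for ties. It is an illustration only, not the ROCR or EGAD implementation, and the function name and toy scores are assumptions.

```python
def auroc(scores, labels):
    """Area under the ROC curve from the rank-sum of the positives.

    `scores` are edge weights; `labels` are 1 for known co-expressed
    pairs and 0 otherwise. Tied scores share their average rank.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Extend the run while the next score is tied with the current one.
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        positives_in_run = sum(1 for k in range(i, j + 1) if pairs[k][1])
        rank_sum_pos += average_rank * positives_in_run
        i = j + 1
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: two known pairs (label 1) and two unknown pairs (label 0).
example_auroc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

A value of 0.5 corresponds to random ordering, which is why the shuffled networks described above serve as the baseline.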
Definitions of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not; TN, a network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO dataset.
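The pairwise definitions for the PPPTY evaluation can be made concrete with a small counting sketch in Python. The function name, the gene names, and the edge lists are hypothetical, and a thresholded network (an explicit list of predicted edges) is assumed for simplicity.

```python
from itertools import combinations


def confusion_counts(genes, predicted_edges, reference_edges):
    """Count TP, FP, TN, and FN over all unordered gene pairs."""
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for pair in combinations(sorted(genes), 2):
        key = frozenset(pair)
        in_predicted = key in predicted
        in_reference = key in reference
        if in_predicted and in_reference:
            counts["TP"] += 1
        elif in_predicted:
            counts["FP"] += 1
        elif in_reference:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Toy example: four genes give six possible pairs.
toy_counts = confusion_counts(
    genes=["g1", "g2", "g3", "g4"],
    predicted_edges=[("g1", "g2"), ("g2", "g3")],
    reference_edges=[("g2", "g1"), ("g3", "g4")],
)
```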

Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distributions were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) in MCL v14-137 with the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
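The core idea of Markov clustering, alternating expansion (matrix squaring) and inflation (element-wise powering followed by column normalization), can be sketched in a few lines of Python. This is a simplified, dense-matrix illustration and not the MCL software used here: it omits pruning and the other optimizations of the real program, adding self-loops is an assumption of this sketch, and only the inflation value of 1.8 is taken from the text.

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to one."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(adjacency, inflation=1.8, max_iter=100, tol=1e-9):
    """Minimal Markov clustering on a dense adjacency matrix."""
    n = len(adjacency)
    # Self-loops (an assumption of this sketch) keep column sums positive.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: square the column-stochastic matrix.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n))
                     for j in range(n)] for i in range(n)]
        # Inflation: raise entries to a power, then renormalize columns.
        inflated = normalize_columns(
            [[value ** inflation for value in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Each row with remaining mass defines a cluster of its nonzero columns.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return clusters


# Toy graph: two disconnected triangles (nodes 0-2 and nodes 3-5).
triangle_pair = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
modules = mcl(triangle_pair)
```

Higher inflation values generally yield more, smaller clusters, so the choice of 1.8 affects module granularity.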

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15,116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and values below 0.05 were considered significant. The Yekutieli method was used for multiple-test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
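The hypergeometric enrichment p-value for a single GO term is the probability of seeing at least the observed number of annotated genes when drawing the query set at random from the background. The Python sketch below computes this upper tail exactly with integer binomial coefficients. It illustrates the test only, not AgriGO's implementation, the Yekutieli correction is not included, and the function name and toy numbers are assumptions.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from a background of N genes,
    of which K carry the GO term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Toy example: background of 10 genes, 4 annotated; query of 3 genes,
# 2 of which are annotated.
p_value = hypergeom_upper_tail(k=2, n=3, K=4, N=10)
```

For the analysis described here, N would be the 15,116 background genes and n the size of the query gene set, such as the 214 co-expressed genes.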

Database Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) through its website and the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4, with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on the topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion of the data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. The 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for the 214 genes co-expressed with the cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by the 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by the 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by the 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of RNA-Seq Illumina samples (solid line) per year from 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.

Figure 3. Similarity among ten inference methods in network performance based on GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against the natural logarithm of sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against the natural logarithm of sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
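The logarithmic model in this legend regresses average AUROC on the natural logarithm of sample size. A minimal ordinary least-squares sketch in Python is given below. It illustrates the model form only and is not the authors' lm() call: p-values are not computed, and the function name and the synthetic AUROC values are assumptions.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    xs = [math.log(size) for size in sample_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(aurocs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Synthetic data lying exactly on a log curve (values are made up).
sizes = [12, 24, 100, 404, 1266]
synthetic_aurocs = [0.5 + 0.03 * math.log(size) for size in sizes]
intercept, slope, r_squared = fit_log_model(sizes, synthetic_aurocs)
```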

Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single networks (white, "1266"), aggregated networks (grey, "agg"), and protein networks (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in the three networks; 106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by the 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported functions in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported functions in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported functions in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by more than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm normalization (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation of the log2-transformed expression values. C, Normalized gene expression values for 15,116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1 to V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the diagonal blocks, between the AUROC scatter plots and the PCC values. AUROC values evaluated with the GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the diagonal blocks, between the AUROC scatter plots and the PCC values. AUROC values evaluated with the PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
762
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of the genes with the highest AUROC
values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). The average expression in
CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid
circles.
767
Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC
values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted
against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to
1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included
are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range
above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines
are the average AUROC values from the 17 individual networks of each category. Mean values of each network are
labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET,
Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
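The outlier rule stated in this legend can be sketched as follows. This Python example assumes linearly interpolated quartiles (the "inclusive" method of the standard library, which corresponds to R's default quantile type); R's boxplot() computes hinges slightly differently, so results may differ at the margins. The AUROC values are invented for illustration.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR above Q3 or below Q1."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values for one network.
auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
flagged = iqr_outliers(auroc)
```

With these toy values only 0.95 is flagged, because it lies above the upper fence of roughly 0.725.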
777
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with
17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation
Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate
outliers.
784
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red
and blue curves show the power-law fitted distribution. The R² value indicates the goodness of fit to the power-law model.
789
Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of
edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of
nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.
794
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in color. Genes not in modules 1-8 are shown as light grey nodes.
798
799
Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics, pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp 10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc114132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the same inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the same inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are shown on a color scale from blue to red. Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
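The kind of logarithmic model described in this legend, average AUROC regressed on the natural logarithm of sample size, can be sketched with ordinary least squares. The Python example below is an illustration only: the authors report using lm() in R, the function name and the toy data are hypothetical, and the p-value that lm() also reports is omitted here because it requires a t-distribution.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    xs = [math.log(s) for s in sample_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(aurocs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical data generated from an exact logarithmic relationship.
sizes = [12, 48, 192, 768]
toy_auroc = [0.5 + 0.02 * math.log(s) for s in sizes]
intercept, slope, r_squared = fit_log_model(sizes, toy_auroc)
```

A positive slope with a high R-squared would indicate that performance rises with sample size but with diminishing returns, since each doubling of the sample size adds the same fixed increment.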
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
[Figure 7 panels: A and B, Venn diagrams for the MS, PA, and SA networks; C, GO term graph for the hub genes, dominated by chromatin, DNA replication and repair, and gene expression terms; D, GO term graph for the cell wall query, dominated by cell wall, cellulose, and polysaccharide metabolism terms]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
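The region counts in a three-way Venn diagram such as panels A and B follow directly from set operations. The sketch below is a minimal Python illustration with toy gene identifiers; the function name, dictionary keys, and gene IDs are hypothetical, and how the authors ranked genes or edges before taking the top sets is not shown.

```python
def venn3_counts(pa, sa, ms):
    """Count the members of each region of a three-set Venn diagram."""
    pa, sa, ms = set(pa), set(sa), set(ms)
    return {
        "PA_only": len(pa - sa - ms),
        "SA_only": len(sa - pa - ms),
        "MS_only": len(ms - pa - sa),
        "PA_SA_only": len((pa & sa) - ms),
        "PA_MS_only": len((pa & ms) - sa),
        "SA_MS_only": len((sa & ms) - pa),
        "all_three": len(pa & sa & ms),
    }


# Toy top-gene lists for the three networks (illustrative IDs only).
pa_top = ["g1", "g2", "g3", "g4"]
sa_top = ["g2", "g3", "g4", "g5"]
ms_top = ["g3", "g4", "g6"]

counts = venn3_counts(pa_top, sa_top, ms_top)
print(counts)
```

The seven regions are disjoint, so their counts sum to the size of the union of the three sets, which is a quick sanity check on any reported Venn diagram.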
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5-e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545-3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707-726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572-15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143-175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670-85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575-1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924-13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200-20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244-56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R package version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860-1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1-8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866-12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29-46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170 pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
publicly accessible at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php. A tutorial is also provided as supplemental material.

Results
Manually Curated Maize mRNA Expression Profiling from Publicly Available Datasets
Recently, the usage of RNA-Seq in maize has increased dramatically, from zero data entries in NCBI-SRA in 2008 to over nine hundred in 2016 (Fig. 1B). In contrast, the most widely used Affymetrix expression array for maize had 177 samples in 2008 but only 46 in 2016 (Fig. 1B). GCN construction approaches have not been optimized for RNA-Seq datasets in plants, and doing so could improve the quality and robustness of GCNs. To support a comprehensive evaluation of the effect of RNA-Seq normalization methods and network inference methods on the performance of GCNs, maize RNA-Seq datasets were compiled and processed with a computational pipeline (Supplemental Fig. 1). In total, 1266 high-quality RNA-Seq maize libraries from 17 different experiments were selected as input to an expression matrix. The corresponding experimental descriptions and publications, where available, of each library were manually checked for sample information (Supplemental Table S1). In addition, filters for read depth and alignment rate were used to remove unqualified libraries (see Methods for detail). Tissue type and haplotype from those libraries were manually curated and found to include a range of sample types (Supplemental Table S1). Shoot apical meristem (SAM), leaf, and root were the three most abundant tissue types, but a wide range of tissues were represented by multiple libraries in the dataset (Supplemental Fig. 1). The dataset also included multiple haplotypes, although B73 represented approximately 40% of the included libraries. To reduce noise, lowly expressed genes were removed from analysis, leaving 15116 nonredundant genes across the 1266 libraries. For comparison, the Affymetrix GeneChip maize array includes 13339 genes before filtering (GeneChip Maize Genome Array, http://www.affymetrix.com/catalog/131468/AFFY/Maize+Genome+Array#1_1).

Three RNA-Seq Normalization Methods Show Comparable Distribution of Expression
Expression data from distinct sources and experiments can be highly variable because of hybridization artifacts in microarrays or variable sequencing depth in RNA-Seq. Many methods have been used successfully to normalize both microarray and RNA-Seq data and correct for potential biases (Lim et al., 2007; Dillies et al., 2013b; Li et al., 2015b). To find an optimal normalization method for building a maize GCN from RNA-Seq data, three widely used normalization methods were compared: Variance Stabilizing Transformation (VST), Counts Per Million (CPM), and Reads Per Kilobase Per Million (RPKM) (Mortazavi et al., 2008; Anders and Huber, 2010; Rau et al., 2013). For all normalization methods, log2 transformation of the normalized expression values reduced the skew of the data distribution (Supplemental Fig. 2). Several network studies based on plant RNA-Seq data used log2 transformation (Davidson et al., 2011; Ma and Wang, 2012; Giorgi et al., 2013; Stelpflug et al., 2015; Walley et al., 2016). In our analysis, genes with CPM > 2 in more than 1000 samples were included. This filter dramatically reduced zero count values in the raw data from 30.949% to 0.367%. Moreover, a prior count of one was added at log2 normalization (expression = log2(CPM/RPKM + 1)) to avoid problems with the remaining zero values. The log2 transformation reduced skewed distributions and extreme values represented by outliers (Supplemental Fig. 2). Thus, we consider it important to apply log2 transformation to our data.
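The filtering and transformation steps described above can be illustrated with a minimal sketch. It assumes a genes-by-samples table of raw counts and library sizes computed as column sums; the function names and the toy counts are hypothetical and are not taken from the pipeline used in this study.

```python
import math

def cpm(counts):
    """Convert a genes x samples table of raw counts to counts per million."""
    n_samples = len(counts[0])
    # Assumption: library size is the column sum over all genes in the table.
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)] for row in counts]

def filter_and_log(counts, min_cpm=2.0, min_samples=1000):
    """Keep genes with CPM > min_cpm in more than min_samples samples,
    then return (gene index, log2(CPM + 1) values) for each kept gene."""
    kept = []
    for i, row in enumerate(cpm(counts)):
        if sum(v > min_cpm for v in row) > min_samples:
            kept.append((i, [math.log2(v + 1.0) for v in row]))
    return kept

# Hypothetical toy data: 3 genes x 4 samples; the sample threshold is scaled
# down from 1000 to 2 so that the filter is meaningful on four samples.
toy = [
    [500, 800, 650, 700],
    [0, 1, 0, 2],
    [499500, 999199, 799350, 599298],
]
result = filter_and_log(toy, min_cpm=2.0, min_samples=2)
kept_genes = [i for i, _ in result]
```

In this toy case the second gene exceeds 2 CPM in only one sample and is removed, while the prior count of one keeps any remaining zeros finite after the log2 step.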
The distribution of gene expression across the 1266 libraries formed a bell-shaped curve with a small additional peak of low expression for all three methods (Supplemental Fig. 2). To determine whether these low expression values came from a few or many libraries, elements within the range of expression that corresponded to the observed peak (< -37 CPM; Supplemental Fig. 2B) were extracted from the CPM-normalized expression matrix and matched to the originating libraries. This demonstrated that the low expression elements were not limited exclusively to specific libraries, but eight libraries contributed over 25% of the low elements. A gene ontology enrichment analysis failed to identify significant gene ontology descriptors within the subset of 43 genes that were defined as lowly expressed (data not shown). All eight of these libraries were from pollen tissue, where the average gene expression, at 147 Counts Per Million (CPM), is lower than the average gene expression of the other 79 tissues combined, at an average of 183 CPM. Hierarchical clustering and a correlation heatmap of the same data (Stelpflug et al., 2015) show the uniqueness of the pollen tissue expression pattern (Langfelder and Horvath, 2008) (Supplemental Fig. 3). When the lowly expressed elements from RPKM- and VST-normalized data were analyzed to determine library origin and GO enrichment (data not shown), we found a similarly high level of pollen-specific libraries without significant GO categories. In pollen, some highly expressed genes are considered orphan genes (Wu et al., 2014) because they lack detectable homologs in other species. To investigate whether these lowly expressed genes were orphan genes, their gene sequences were searched against the Setaria italica genome (JGIv2) using BLASTX (e-value < 1E-03). Setaria italica (foxtail millet) is a close relative of maize, from which it diverged 234 million years ago (MYA) as estimated by TimeTree (Kumar et al., 2017). Only 1 of the 43 genes lacked detectable homologs in Setaria italica (data not shown), indicating that the majority of these genes are not likely to be orphan genes.
Because RPKM normalization accounts for gene length, the distribution of gene length versus expression for the RPKM method was compared to data normalized by the VST and CPM methods. VST- and CPM-normalized data showed very similar overall patterns, with no clear linear relationship between gene length and average expression (Supplemental Fig. 2C). RPKM-normalized data displayed an apparent bias toward elevated expression of a small number of genes less than 5000 bp in length and lower expression of long genes, suggesting that this normalization method might skew the distribution of expression for some genes. Overall, in spite of these differences, the three normalization methods resulted in a similar distribution of expression patterns for most of the genes included in the analysis. Additional analysis was completed to determine whether the three normalization methods influence network performance.
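As a worked illustration of why RPKM and CPM can diverge with gene length, the sketch below applies the standard definitions of the two measures to two hypothetical genes that receive the same number of reads. The read counts, gene lengths, and library size are invented for illustration only.

```python
def cpm(count, library_size):
    """Counts per million mapped reads."""
    return count / (library_size / 1e6)

def rpkm(count, gene_length_bp, library_size):
    """Reads per kilobase of gene length per million mapped reads."""
    return count / (gene_length_bp / 1e3) / (library_size / 1e6)

# Hypothetical example: two genes with identical read counts in one library
library_size = 20_000_000
short_gene = {"count": 2000, "length_bp": 1000}
long_gene = {"count": 2000, "length_bp": 8000}

cpm_short = cpm(short_gene["count"], library_size)
cpm_long = cpm(long_gene["count"], library_size)
rpkm_short = rpkm(short_gene["count"], short_gene["length_bp"], library_size)
rpkm_long = rpkm(long_gene["count"], long_gene["length_bp"], library_size)
```

Under CPM the two genes receive the same value, whereas RPKM assigns the 8000 bp gene one eighth of the value of the 1000 bp gene, which is the kind of length dependence examined in Supplemental Fig. 2C.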

Network Performance Does Not Differ Based Upon Normalization Method
To compare the efficacy of the three normalization and ten inference methods, a GCN was generated for each combination of normalization and inference method. All networks were then rank-standardized so that edge weights ranged from 0 to 1 (see Methods). All network evaluations used the whole adjacency matrix (15116 × 15116 in RNA-Seq networks; 11429 × 11429 or 17862 × 17862 in protein networks) without a cut-off. The performance of the different networks was measured by comparing the area under the receiver operating characteristic curve (AUROC). AUROC is a measure used to evaluate the accuracy of classification models, making it suitable for evaluating GCNs (Gillis and Pavlidis, 2011; Ma and Wang, 2012; Liu et al., 2017). AUROC values range from 0 to 1: a value closer to 1 indicates that the network is discriminating nonrandom patterns and approaching perfect classification, random networks return values close to 0.5, and values closer to 0 indicate a high degree of incorrect classification. While an AUROC value close to 1 is optimal, values over 0.7 suggest good performance when analyzing large, diverse networks (Gillis and Pavlidis, 2011). To set up the AUROC baseline for random networks, maize gene IDs were shuffled 10 (for MRNET and CLR) or 1000 times (for PCC) in the normalized expression matrix. The randomized expression matrices were processed with the designated inference algorithms and then evaluated. The resulting AUROC values from randomized networks were very close to 0.5 (Supplemental Table S2).
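The following sketch illustrates rank standardization and the AUROC calculation on a toy set of edges. It is an assumption-laden simplification: AUROC is computed from the Mann-Whitney rank-sum identity, the edge weights and labels are hypothetical, and the random baseline is approximated by shuffling labels rather than by shuffling gene IDs in the expression matrix and re-inferring the network as was done in this study.

```python
import random

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_standardize(weights):
    """Map edge weights to (0, 1] by dividing each rank by the number of edges."""
    return [r / len(weights) for r in average_ranks(weights)]

def auroc(scores, labels):
    """AUROC from the rank sum of the positives (Mann-Whitney U / (n_pos * n_neg))."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, is_pos in zip(ranks, labels) if is_pos)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# Hypothetical weights between one query gene and six candidate partners;
# True marks partners that share a known pathway or interaction.
weights = [0.91, 0.15, 0.78, 0.40, 0.05, 0.66]
labels = [True, False, True, False, False, False]
standardized = rank_standardize(weights)
score = auroc(standardized, labels)

# Simplified random baseline: shuffle the labels many times (assumption).
rng = random.Random(0)
baseline = []
for _ in range(1000):
    shuffled = labels[:]
    rng.shuffle(shuffled)
    baseline.append(auroc(standardized, shuffled))
mean_baseline = sum(baseline) / len(baseline)
```

Because rank standardization preserves the ordering of edges, it does not change the AUROC of a single network; its purpose is to put networks inferred by different methods on a common 0 to 1 scale.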
AUROC values were calculated and compared for three different network characteristics. The first characteristic was designed to test whether the network identified genes with known or predicted co-expression patterns, based upon prior results and inclusion in two existing datasets that could serve as a positive control for co-expression. The maize metabolic pathway database (MaizeCyc) contains 413 pathways with more than two genes and was built from evidence including genome annotation, phylogenetic distance, and known genes in maize, rice, and Arabidopsis (Monaco et al., 2013). The maize protein-protein interaction database (PPIM), the second dataset used in this analysis, is based upon both predicted and experimentally detected protein interactions (Zhu et al., 2016). Only high-confidence interactions from PPIM were used, defined as those ranking top 5 in their model (Zhu et al., 2016). For comparison with the GCN, genes within the same MaizeCyc or PPIM pathways were considered co-expressed. The MaizeCyc and PPIM datasets were combined, and genes with fewer than 5 interactions were excluded from evaluation, creating a compiled dataset referred to herein as the Protein-Protein and Pathway dataset (PPPTY). PPPTY had 1720 genes and 104856 interactions that were used in this evaluation. The AUROC value was calculated for each of the 1720 gene terms.
To assess the effect of normalization method on GCNs, AUROC values for all ten inference methods were averaged for each of the three normalization methods. All three normalization methods scored similarly in comparison with the PPPTY dataset (Fig. 2B), with a mean AUROC value of around 0.575 for each, suggesting that the predicted networks were more selective than a random network.
The second characteristic was the presence of similar gene ontology (GO) information for maize genes within a detected co-expression set, based upon "guilt by association," which assumes that specific subgroups of co-expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from AgriGO (Du et al., 2010), which uses signature integration by InterPro, rather than co-expression data, to map gene IDs to GO terms. InterPro provided over 108 million stable GO terms to the functional protein information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations provide a reliable evaluation resource independent of co-expression data. To assess this characteristic, gene ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for sets of co-expression matrices and compared. Co-expression matrices were assessed by 3-fold cross-validation, which involved masking GO terms from some genes to test whether the masked GO terms could be predicted based upon gene expression patterns. In total, 277 GO terms were included in this analysis.
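A minimal sketch of neighbor voting with 3-fold cross-validation for a single GO term is given below. It is a simplified illustration and not the implementation of Gillis and Pavlidis (2011): folds are assigned by gene index, each held-out gene is scored by the fraction of its total edge weight that connects to genes still visibly annotated, and AUROC is computed on the held-out genes only. The network and annotations are hypothetical.

```python
def auroc(scores, labels):
    """Fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def neighbor_voting_cv(network, annotated, n_folds=3):
    """Mean held-out AUROC for one GO term across n_folds folds."""
    n = len(network)
    aurocs = []
    for fold in range(n_folds):
        # Assumption: folds are assigned by gene index for reproducibility.
        held_out = [g for g in range(n) if g % n_folds == fold]
        # Mask the annotation of the held-out genes.
        visible = [annotated[g] and g not in held_out for g in range(n)]
        scores = []
        for g in held_out:
            total = sum(network[g][h] for h in range(n) if h != g)
            votes = sum(network[g][h] for h in range(n) if h != g and visible[h])
            scores.append(votes / total if total > 0 else 0.0)
        labels = [annotated[g] for g in held_out]
        # AUROC is defined only when a fold holds both classes.
        if any(labels) and not all(labels):
            aurocs.append(auroc(scores, labels))
    return sum(aurocs) / len(aurocs) if aurocs else float("nan")

# Hypothetical network: two modules of three genes with strong within-module
# edges (0.9) and weak between-module edges (0.1); module 0 carries the GO term.
module = [0, 0, 0, 1, 1, 1]
network = [[0.9 if module[i] == module[j] else 0.1 for j in range(6)] for i in range(6)]
annotated = [True, True, True, False, False, False]
mean_auroc = neighbor_voting_cv(network, annotated)
```

In this toy network every masked gene from the annotated module receives a higher vote than the unannotated gene in the same fold, so the held-out AUROC is 1; with real data the score reflects how well co-expression neighborhoods recover masked annotations.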
When GO characteristics were used to assess the networks, all three normalization methods performed similarly, but the AUROC values, at around 0.689 for each, were higher than those observed for comparisons with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions. The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets were 0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant difference in the AUROC scores associated with the GCNs for the characteristics that were tested.
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a subset of co-expressed genes. The interactions between such a protein and regulated DNA can be detected by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, five ChIP-Seq datasets are available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et al., 2015; Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example, Opaque2 is specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral tissues (Bolduc et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed (Morohashi et al., 2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for comparison, with 26 confirmed binding targets that are relatively highly expressed in most maize tissues (Yang et al., 2016). Histone deacetylation often correlates with decreased gene expression (Verdin and Ott, 2014). High-confidence HDA101 targets were defined as those discovered by ChIP-Seq that also showed increased gene expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101 targets were compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar among networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY evaluations.

Correlation Methods Perform Better than Mutual Information at Some Genes
After normalization, the expression matrices can be processed by different methods for GCN inference. To optimize this step, the AUROC values of six correlation (PCC, SCC, KCC, GCC, BIC, CSC) and four mutual information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices that were generated from each of the three normalization methods (VST, CPM, RPKM) and then averaged. In general, correlation methods are more computationally efficient, while MI methods are able to reveal non-linear relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC, KCC, and BIC are less sensitive to outliers because SCC and KCC consider only rank information and BIC is calculated from the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to be a better correlation method for gene expression analysis because of its capacity to detect non-linear relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and for analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR showed extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same AUROC-based tests were performed as described for the testing of normalization methods.
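To make the contrast between some of these measures concrete, the sketch below computes PCC, SCC, and CSC for two hypothetical expression profiles in which one sample is an outlier. The implementations follow the textbook definitions (SCC as the Pearson correlation of ranks, CSC as the cosine of the angle between the two vectors) and assume no tied values; they are illustrative and are not the code used to build the networks in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient (PCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; this sketch assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for r, i in enumerate(order, start=1):
        out[i] = r
    return out

def spearman(x, y):
    """Spearman correlation coefficient (SCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def cosine(x, y):
    """Cosine similarity coefficient (CSC)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Hypothetical profiles across five samples; the last value of gene_b is an outlier.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.1, 2.3, 2.9, 4.2, 50.0]

pcc = pearson(gene_a, gene_b)
scc = spearman(gene_a, gene_b)
csc = cosine(gene_a, gene_b)
```

The two profiles increase together in every sample, so the rank-based SCC is 1, while the single outlier pulls the PCC well below 1. This is the sensitivity to outliers referred to above.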
Assessed by the GO dataset, the 277 AUROC values were averaged to create one value for each of the 10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization methods for the six correlation methods was 0.718, while the average AUROC for the four MI methods was 0.646. The majority of the 277 GO terms had similar AUROC values in the different correlation method-generated GCNs, and these patterns differ from those observed in the MI-generated GCNs (Fig. 3A). The similarity among methods was also detectable by pairwise comparison of Pearson correlations between the different methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed that networks constructed using correlation methods had higher AUROC values than those built with MI methods, although the CSC method resulted in lower AUROC values than the other correlation methods. As demonstrated for the GO evaluation, results from correlation methods were more similar to each other than to those from the MI methods (Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B), although this subset is small enough that the average AUROC value over the whole gene set was relatively low for those methods. The gene sets with the highest AUROC values in PCC, CSC, or MRNET were extracted. The gene sets were compared by average expression (CPM) and average number of lowly expressed elements (CPM < 0). The CSC gene set had the smallest number of low-expression elements and had higher average expression than both the 1720-gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from the 26 targets of the HDA101 ChIP-Seq dataset reveal that the CSC GCN had the highest AUROC value and that MRNET/CLR GCNs resulted in slightly higher scores than correlation methods (Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes (Tzfadia et al., 2016). HDA101 is highly expressed in all samples, with an average expression value of 864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is better suited to the CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using the two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5% non-zero values for all comparisons, and so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize GCN from a single expression matrix, but co-expression with some individual genes may be better detected using MI methods. Normalization method did not have a substantial influence on GCN performance, so only CPM normalization was used in conjunction with PCC, SCC, MRNET, and CLR inference for subsequent optimization of other parameters.

Increased Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were conducted with different numbers of samples and experiments to empirically determine the effect of sample number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between 12 and 404 libraries. For this analysis, CPM normalization followed by each of four inference methods (PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were evaluated by both GO and PPPTY.
From the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between the natural logarithm of sample size and average AUROC value (Fig. 4). The linear relationships were stronger for the PCC and SCC methods, which had higher r-squared values, indicating that correlation methods benefit more from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC values than the MRNET and CLR methods in the PPPTY and GO analyses for most individual networks (Fig. 5).
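The relationship described above can be examined with an ordinary least-squares fit of average AUROC against the natural logarithm of sample size. The sketch below shows one way to obtain the slope and r-squared value; the sample sizes and AUROC values are hypothetical and are not the values plotted in Fig. 4.

```python
import math

def fit_log_linear(sample_sizes, aurocs):
    """Fit AUROC = intercept + slope * ln(sample size) by least squares.
    Returns (slope, intercept, r_squared)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical per-experiment values for one inference method
sizes = [12, 24, 48, 96, 192, 404]
mean_aurocs = [0.55, 0.58, 0.60, 0.63, 0.65, 0.68]
slope, intercept, r_squared = fit_log_linear(sizes, mean_aurocs)
```

A positive slope with a high r-squared value corresponds to the pattern reported for PCC and SCC, in which each multiplicative increase in sample size adds a roughly constant increment to the AUROC.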

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also change the outcome of GCN construction by buffering the effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated rank-standardized correlation/MI matrices were calculated from separate experiments to determine whether this approach enhanced GCN performance. Aggregating individual networks for meta-analysis can help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET, and CLR methods for GCN inference, as before, resulting in 68 single GCNs. The 17 experiments were aggregated for PCC, SCC, MRNET, and CLR individually and evaluated by the GO and PPPTY datasets.
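One simple form of ranked aggregation is sketched below: each per-experiment network is rank-standardized, the standardized weights are averaged edge by edge, and the averages are rank-standardized again. This is an assumption about the general approach rather than a description of the exact procedure in Methods; the function names and the three toy networks are hypothetical, and ties are broken by edge order for brevity.

```python
def rank_standardize(weights):
    """Replace each edge weight by rank / number of edges, giving values in (0, 1]."""
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    out = [0.0] * len(weights)
    for r, i in enumerate(order, start=1):
        out[i] = r / len(weights)
    return out

def aggregate(networks):
    """Average rank-standardized edge weights across experiments,
    then rank-standardize the averaged network."""
    standardized = [rank_standardize(w) for w in networks]
    n_edges = len(networks[0])
    mean = [sum(s[e] for s in standardized) / len(standardized) for e in range(n_edges)]
    return rank_standardize(mean)

# Hypothetical weights for the same four edges in three experiments.
# The second experiment uses a different scale, as an MI-based network might.
experiment_1 = [0.9, 0.2, 0.5, 0.1]
experiment_2 = [12.0, 3.0, 1.0, 2.0]
experiment_3 = [0.8, 0.3, 0.6, 0.4]
aggregated = aggregate([experiment_1, experiment_2, experiment_3])
```

Working on ranks means that each experiment contributes equally regardless of the scale of its raw weights, and an edge must rank highly in several experiments to rank highly in the aggregated network.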
Of the 4 aggregated networks that were evaluated, the two correlation methods (PCC and SCC) had higher AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR networks compared with single networks built from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCN performed best using a ranked aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a robust GCN.

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level and function of a protein of interest. However, many researchers have found inconsistencies between mRNA and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al., 2016). Although relatively little protein expression data is available, such data is amenable to GCN construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO datasets as described above.
GCNs constructed from protein expression did not exhibit AUROC values superior to those observed for RNA-Seq-based GCNs using the aggregation strategy (Fig. 6). When evaluated by the GO and PPPTY datasets, the performance of the protein network was lower than that of the aggregated network as well as the single network from 1266 samples. To confirm this result, a two-way ANOVA was computed with pairwise comparison for the GO evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 overlapped with our RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes to construct a protein network (Supplemental Fig. 7). No improvement could be detected for the protein network derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY evaluation from a one-sided Wilcoxon rank sum test.

PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluating network performance based upon biological characteristics, networks can be
compared based upon several network characteristics, including clustering coefficient, number of
nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007),
number of detected modules, and number of genes in the largest module. The number of nodes is a basic construct in
graph theory that depicts the scale of a network. Clustering coefficient and number of modules describe how
densely nodes are connected in a network. Heterogeneity measures the variability of node connections.
Centralization indicates how likely it is that some nodes have significantly more connections than average. In this
analysis, each gene corresponds to a node. Based on the extensive evaluation using biological
characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize
networks were selected for comparison of basic network characteristics based on their overall performance:
the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single
network from 1266 total samples (MS). The three networks were constrained to include the top one million
predicted interactions, or edges.
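
As an illustration of the two degree-based measures named above, the following minimal sketch computes heterogeneity and centralization from an undirected edge list. It assumes the commonly used definitions (heterogeneity as the coefficient of variation of node degree, centralization as n/(n-2) × (max degree/(n-1) − density)) and uses the population standard deviation; the function name degree_summary is hypothetical, and the values reported in this study were computed in Cytoscape, not with this code.

```python
from collections import defaultdict
from statistics import mean, pstdev


def degree_summary(edges):
    """Summarize an undirected, unweighted network given as (gene_a, gene_b) pairs.

    Assumed definitions (not taken from the study's software):
      density        = mean degree / (n - 1)
      heterogeneity  = population SD of degree / mean degree
      centralization = n / (n - 2) * (max degree / (n - 1) - density)
    Requires at least three nodes.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    degrees = [len(partners) for partners in neighbors.values()]
    n = len(degrees)
    density = mean(degrees) / (n - 1)
    heterogeneity = pstdev(degrees) / mean(degrees)
    centralization = n / (n - 2) * (max(degrees) / (n - 1) - density)
    return {
        "nodes": n,
        "density": density,
        "heterogeneity": heterogeneity,
        "centralization": centralization,
    }


# A star-shaped toy network: one hub connected to four otherwise isolated genes.
print(degree_summary([("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]))
```

Under these definitions a star-shaped network is maximally centralized, whereas a network in which every node has the same degree has zero heterogeneity and zero centralization.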
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution
(Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks
constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and
the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS
network had the highest network centralization value. The network heterogeneity value of MS was more than
twice that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table
S4), consistent with the highest centralization value observed for this network. Centralization and
heterogeneity are two measures that describe the degree distribution of a network. A scale-free network with more
hubs has larger values of centralization and heterogeneity; conversely, a network with larger values of
centralization and heterogeneity may either contain a larger number of hubs or contain a modest number of hubs
with an extremely imbalanced degree distribution. In biological networks, many observations have
connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath
and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the
possibility that the high values resulted from an extremely imbalanced degree distribution. For the MS network,
most highly connected genes interacted with a large number of lowly connected genes; this pattern is also
reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental
Fig. 8).

The genes with the most interactions are expected to act as key components in GCNs
(Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein
biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were
analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and
processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig.
7A). A total of 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong
candidates for central biological regulators. The annotation of these genes suggests their participation in a
range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation, and gene
silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was
expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic
regulators were found among the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
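
The comparison of top interacting genes across networks can be sketched as follows. This is a minimal illustration, assuming that "most interacting" means highest node degree in each edge list and that ties are broken alphabetically; the function names and the toy gene labels are hypothetical and do not reproduce the original analysis.

```python
from collections import Counter


def top_hubs(edges, k):
    """Return the k genes with the highest degree in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by decreasing degree; break ties by gene name so the result is deterministic.
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:k])


def shared_hubs(networks, k):
    """Intersect the top-k hub sets of every network in a {name: edge list} mapping."""
    hub_sets = [top_hubs(edges, k) for edges in networks.values()]
    return set.intersection(*hub_sets)


# Toy networks with hypothetical gene labels.
toy_networks = {
    "net1": [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")],
    "net2": [("g1", "g5"), ("g1", "g6"), ("g3", "g5"), ("g3", "g6"), ("g3", "g1")],
}
print(shared_hubs(toy_networks, k=2))
```

With k set to 1000 and the three final networks as input, the same intersection would correspond to the set of genes shared among all three hub lists.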
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was
used to identify network modules (Enright et al., 2002; Morris et al., 2011). The results showed a shared pattern
between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger detected modules than the PA and SA networks. Consequently, most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization
value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with
related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA
networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, so these
methods appear to be better suited for module detection.
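
For readers unfamiliar with MCL, the following is a small, unoptimized sketch of the expansion and inflation steps on an unweighted network. It is not the MCL software used in this study; the function name, the added self-loops, the convergence tolerance, and the rule for reading clusters from attractor rows are simplifying assumptions, and the dense matrices make it suitable only for toy examples.

```python
def mcl(edges, inflation=1.8, max_iter=100, tol=1e-9):
    """Toy Markov clustering of an undirected, unweighted edge list.

    Returns a sorted list of clusters, each a sorted list of node names.
    """
    nodes = sorted({gene for edge in edges for gene in edge})
    index = {gene: i for i, gene in enumerate(nodes)}
    n = len(nodes)

    # Adjacency matrix with self-loops (a common MCL convention, assumed here).
    m = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        m[index[a]][index[b]] = 1.0
        m[index[b]][index[a]] = 1.0
    for i in range(n):
        m[i][i] = 1.0

    def normalize(mat):
        # Rescale every column to sum to one (column-stochastic matrix).
        for j in range(n):
            total = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= total
        return mat

    m = normalize(m)
    for _ in range(max_iter):
        # Expansion: square the matrix (two-step random walks).
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: element-wise power followed by column normalization.
        inflated = normalize([[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break

    # Rows with a positive diagonal are attractors; the columns they reach form a cluster.
    clusters = set()
    for i in range(n):
        if m[i][i] > 1e-6:
            clusters.add(frozenset(nodes[j] for j in range(n) if m[i][j] > 1e-6))
    return sorted(sorted(cluster) for cluster in clusters)


# Two disconnected triangles are recovered as two modules.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f")]
print(mcl(toy_edges, inflation=1.8))
```

Larger inflation values generally yield more and smaller clusters, which is why the inflation setting matters when module sizes are compared between networks.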
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged,
and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA
and SA shared 83.5% of their interactions, while 87.3% of the MS interactions were unique
(Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall
biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php.
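
The intersection step can be sketched as follows, assuming each network is available as a mapping from gene pairs to edge weights and that edges are undirected. The function names and the toy weights are hypothetical; the published merged network is Supplemental Dataset S1.

```python
def top_edges(weights, n_edges):
    """Return the n_edges highest-weighted undirected edges of one network.

    weights maps (gene_a, gene_b) tuples to a score; (a, b) and (b, a) are
    treated as the same edge and self-pairs are ignored.
    """
    canonical = {}
    for (a, b), weight in weights.items():
        if a == b:
            continue
        key = tuple(sorted((a, b)))
        canonical[key] = max(weight, canonical.get(key, weight))
    ranked = sorted(canonical, key=lambda edge: (-canonical[edge], edge))
    return set(ranked[:n_edges])


def merge_networks(networks, n_edges):
    """Keep only edges found among the top n_edges of every network."""
    edge_sets = [top_edges(weights, n_edges) for weights in networks.values()]
    shared = set.intersection(*edge_sets)
    genes = {gene for edge in shared for gene in edge}
    return shared, genes


# Toy example with hypothetical weights; the study used n_edges = 1,000,000.
toy = {
    "PA": {("g1", "g2"): 0.9, ("g2", "g3"): 0.8, ("g1", "g3"): 0.1},
    "MS": {("g2", "g1"): 0.7, ("g3", "g2"): 0.2, ("g1", "g3"): 0.6},
}
shared_edges, shared_genes = merge_networks(toy, n_edges=2)
print(shared_edges, shared_genes)
```

Requiring an edge to appear in every network trades coverage for confidence, which is consistent with the merged network being much smaller than any of its three inputs.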

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged
network was compared to existing knowledge of this pathway. Sixteen well-characterized components of
cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose
synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase
gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted
from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in
the network that met the analysis criteria. As expected, cell wall related GO terms were
enriched among these 214 genes (Fig. 7D; Supplemental Table S9).
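
A guide-gene query of this kind can be sketched as a neighborhood extraction. The sketch below assumes that only edges touching at least one guide gene are retained, which may differ from the criteria actually applied; the function name and gene labels are hypothetical.

```python
def guide_subnetwork(edges, guide_genes):
    """Extract the edges that touch at least one guide gene.

    Returns (sub_edges, partners, silent_guides), where partners are the
    co-expressed non-guide genes and silent_guides are guide genes with no edge.
    """
    guides = set(guide_genes)
    sub_edges = {tuple(sorted(edge)) for edge in edges
                 if edge[0] in guides or edge[1] in guides}
    touched = {gene for edge in sub_edges for gene in edge}
    partners = touched - guides
    silent_guides = guides - touched
    return sub_edges, partners, silent_guides


# Toy network with hypothetical labels: two guides have partners, one has none.
toy_edges = [("guide_a", "x1"), ("x1", "x2"), ("guide_b", "x1"), ("x3", "guide_a")]
sub_edges, partners, silent = guide_subnetwork(toy_edges, ["guide_a", "guide_b", "guide_c"])
print(len(sub_edges), sorted(partners), sorted(silent))
```

Guide genes returned in the last set correspond to queries that, as noted above for two of the 16 guide genes, have no co-expressed partner in the network.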
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR10 protein database using
BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize
genes and their Arabidopsis homologs as queries (Supplemental Table S10). The literature
survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% (7/214) that were involved in cell wall related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). In CORNET, 10 of the 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 of the 16 genes showing predicted protein associations (Fig. 8C), resulting in 817
interactions with 76 genes; 48% (36/75) of the co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network
is provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization and network inference methods using real maize RNA-Seq
data provides a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
among samples that normalization effects were minimized (Paulson et al., 2016). Furthermore,
normalization is performed on a library basis, which means that genes within the same library are normalized by similar factors.
Therefore, when the network is constructed by PCC or BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. The two rank correlations, SCC
and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same is true for
the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in
parmigene (Sales and Romualdi, 2011). Since the three normalization methods produced similar expression
distributions (Supplemental Fig. 2), MI estimates from different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods such as PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study of human GCNs (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using the GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods capture different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed more robust performance than any single inference method in in silico and E. coli
expression networks, referring to this as "the wisdom of crowds". However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs built from eukaryotic data.
In conclusion, we extensively evaluated normalization and inference methods for building an RNA-Seq-based
maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plants Release 31
(http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or HiSeq 2500 instruments,
were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files
were converted to fastq format using the fastq-dump command in the SRA Toolkit (version 2.5.2). Adapters were
trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files were then quality
checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim
et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated from the aligned bam files by
featureCounts 1.5.0 (Liao et al., 2014) (Supplemental Fig. S1). Twenty-six libraries with fewer
than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving
1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was
streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
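
The library exclusion criteria above amount to a simple filter, sketched here under the assumption that each library is summarized by its total read count and overall alignment rate. The field names total_reads and alignment_rate and the accession-like labels are hypothetical; the thresholds are those stated in the text.

```python
MIN_TOTAL_READS = 5_000_000   # libraries with fewer reads were excluded
MIN_ALIGNMENT_RATE = 0.70     # libraries with a lower overall alignment rate were excluded


def passes_qc(library):
    """Return True when a library meets both inclusion thresholds."""
    return (library["total_reads"] >= MIN_TOTAL_READS
            and library["alignment_rate"] >= MIN_ALIGNMENT_RATE)


def filter_libraries(libraries):
    """Keep only the libraries that pass both thresholds."""
    return [library for library in libraries if passes_qc(library)]


# Toy records with hypothetical identifiers.
toy_libraries = [
    {"id": "lib1", "total_reads": 12_000_000, "alignment_rate": 0.91},
    {"id": "lib2", "total_reads": 3_500_000, "alignment_rate": 0.95},
    {"id": "lib3", "total_reads": 20_000_000, "alignment_rate": 0.55},
]
print([library["id"] for library in filter_libraries(toy_libraries)])

# Consistency check on the counts reported in the text.
print(1303 - 26 - 11)
```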

Gene Count Normalization
The expression data was normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase Per Million (RPKM) were calculated with the edgeR package (Robinson et al.,
2010) in the R environment and then log2 transformed (expression = log2(CPM/RPKM + 1)). For both methods, scale
factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. Variance Stabilizing
Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression
higher than 2 CPM in more than 1000 samples were included in further analysis (15116 genes).
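
The CPM and RPKM calculations and the expression filter can be sketched as follows. This is a simplified illustration rather than the edgeR implementation: TMM scale factors are not estimated here and are instead passed in as an optional norm_factor (default 1.0), and all function names and toy values are hypothetical.

```python
import math


def cpm(counts, norm_factor=1.0):
    """Counts per million for one library given as {gene: raw count}.

    norm_factor stands in for a TMM scale factor estimated elsewhere; the
    effective library size is the raw library size times this factor.
    """
    effective_size = sum(counts.values()) * norm_factor
    return {gene: count / effective_size * 1e6 for gene, count in counts.items()}


def rpkm(counts, lengths_bp, norm_factor=1.0):
    """Reads per kilobase per million, using gene lengths in base pairs."""
    per_million = cpm(counts, norm_factor)
    return {gene: per_million[gene] / (lengths_bp[gene] / 1000) for gene in counts}


def log2_plus_one(values):
    """Apply expression = log2(value + 1) to every gene."""
    return {gene: math.log2(value + 1) for gene, value in values.items()}


def expressed_genes(cpm_by_sample, min_cpm=2, min_samples=1000):
    """Genes with CPM above min_cpm in more than min_samples samples."""
    genes = cpm_by_sample[0].keys()
    return {gene for gene in genes
            if sum(sample[gene] > min_cpm for sample in cpm_by_sample) > min_samples}


# Toy library with hypothetical counts and gene lengths.
toy_counts = {"geneA": 10, "geneB": 30, "geneC": 60}
toy_lengths = {"geneA": 2000, "geneB": 1000, "geneC": 500}
print(log2_plus_one(cpm(toy_counts)))
print(log2_plus_one(rpkm(toy_counts, toy_lengths)))
```

Because CPM and RPKM differ only by a per-gene length factor within a library, both transformations preserve the ordering of a gene's expression across samples whenever the library-level factors are the same.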

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated by the cor() function.
Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al.,
2009). Gini Correlation Coefficient was calculated by the adjacencymatrix() function in the rsgcc package (Ma and
Wang, 2012). Biweight midcorrelation was computed by the bicor() function in the WGCNA package (Langfelder and
Horvath, 2008). Cosine similarity coefficient was computed by the cosine() function in the coop package (Schmidt,
2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011).
The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned a rank of
one. The ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all
network weights ranged from zero to one.
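
The rank standardization of adjacency weights can be sketched as follows. How ties were handled is not stated in the text, so average ranks are assumed here; the function name and the toy matrix are hypothetical.

```python
def rank_normalize(matrix):
    """Convert a square weight matrix to rank-based weights in (0, 1].

    All n * n elements are ranked together (smallest value gets rank 1,
    ties receive their average rank, which is an assumption), ranks are
    divided by the number of elements, and the diagonal is set to one.
    """
    n = len(matrix)
    flat = sorted((matrix[i][j], i, j) for i in range(n) for j in range(n))
    total = len(flat)
    ranked = [[0.0] * n for _ in range(n)]
    pos = 0
    while pos < total:
        end = pos
        while end + 1 < total and flat[end + 1][0] == flat[pos][0]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # 1-based average rank of the tie group
        for _, i, j in flat[pos:end + 1]:
            ranked[i][j] = average_rank / total
        pos = end + 1
    for i in range(n):
        ranked[i][i] = 1.0
    return ranked


# Toy symmetric correlation matrix.
toy = [[1.0, 0.2, 0.5],
       [0.2, 1.0, 0.9],
       [0.5, 0.9, 1.0]]
for row in rank_normalize(toy):
    print(row)
```

Placing every inference method on the same rank scale is what makes weights from correlation and mutual information methods comparable when selecting a fixed number of top edges.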

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM- or VST-normalized expression
matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR methods and
evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and
CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the
top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et
al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data
for AGPv3.30 was downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems
was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of
co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate
AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package
(Ballouz et al., 2016) in R. EGAD uses the "guilt-by-association" principle that genes with shared GO
terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be
evaluated by hiding a subset of gene GO terms and testing whether the hidden GO terms can be predicted
from the remaining annotations. The performance of the prediction model was measured by AUROC values in
three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank sum tests were performed in R using the anova() and
pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and
Hochberg, 1995).
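
For reference, AUROC can be computed directly from ranks, as in the minimal sketch below. It uses the standard rank-sum identity with average ranks for ties and is independent of the ROCR and EGAD implementations used in this study; the function name auroc is hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve from the rank-sum (Mann-Whitney) identity.

    scores: predicted edge weights; labels: 1 for a true (reference) pair,
    0 otherwise. Tied scores receive their average rank.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both positive and negative examples are required")
    pairs = sorted(zip(scores, labels))
    positive_rank_sum = 0.0
    pos = 0
    while pos < len(pairs):
        end = pos
        while end + 1 < len(pairs) and pairs[end + 1][0] == pairs[pos][0]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # 1-based average rank of the tie group
        positive_rank_sum += average_rank * sum(label for _, label in pairs[pos:end + 1])
        pos = end + 1
    return (positive_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: two true pairs and two false pairs.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

An AUROC of 0.5 corresponds to random ranking of true and false pairs, which is the reference point provided by the randomized networks described above.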
Definitions of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the
evaluation using the PPPTY dataset: TP, a network predicts that two genes are co-expressed and they are
co-expressed in the PPPTY dataset; FP, a network predicts that two genes are co-expressed but they are not
co-expressed in the PPPTY dataset; TN, a network predicts that two genes are not co-expressed and they are not
co-expressed in the PPPTY dataset; FN, a network predicts that two genes are not co-expressed but they are
co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts that a gene
has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts that a gene has
a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts that a gene does
not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts that a
gene does not have a specific GO term but it does have that GO term in our GO dataset.
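
The four outcome classes for a pairwise reference such as PPPTY can be counted as in the sketch below. It assumes that the prediction has already been reduced to a set of called edges (for example by applying a weight cutoff) and that every unordered gene pair is evaluated; the function name and gene labels are hypothetical.

```python
from itertools import combinations


def confusion_counts(genes, predicted_edges, reference_edges):
    """Count TP, FP, TN, and FN over all unordered gene pairs."""
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for pair in combinations(sorted(genes), 2):
        key = frozenset(pair)
        in_prediction = key in predicted
        in_reference = key in reference
        if in_prediction and in_reference:
            counts["TP"] += 1
        elif in_prediction:
            counts["FP"] += 1
        elif in_reference:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Toy example with four genes, hence six possible pairs.
print(confusion_counts(
    ["a", "b", "c", "d"],
    predicted_edges=[("a", "b"), ("a", "c")],
    reference_edges=[("b", "a"), ("c", "d")],
))
```

Sweeping the cutoff that defines the predicted edges and recording the resulting true and false positive rates traces out the ROC curve summarized by the AUROC.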

Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network
topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity
distributions and node degree distributions were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012).
Graph clustering was performed using the Markov Cluster Algorithm (MCL) as implemented in MCL v14-137 with the inflation value set
to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
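
A power-law fit to a node degree distribution can be approximated by ordinary least squares on log-transformed values, as sketched below. This assumes a fit of log(number of nodes) against log(degree) and is not the NetworkAnalyzer code; the function name and the toy degree list are hypothetical.

```python
import math
from collections import Counter


def power_law_fit(degrees):
    """Least-squares fit of log(count) = intercept + slope * log(degree).

    Returns (slope, intercept, r_squared). Requires at least two distinct
    positive degrees with differing counts.
    """
    frequency = Counter(d for d in degrees if d > 0)
    xs = [math.log(degree) for degree in frequency]
    ys = [math.log(count) for count in frequency.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Toy degree list in which the number of nodes falls as degree ** -2.
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
print(power_law_fit(toy_degrees))
```

A negative slope with a high r-squared value on the log-log scale is the pattern expected for a scale-free network, in which a few genes have many connections and most have few.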

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010).
The 15116 genes included in our networks were used as the background reference. Hypergeometric testing was used
to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was
used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The
results were then imported into Cytoscape for visualization.
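
The hypergeometric enrichment p-value for a single GO term can be written out explicitly, as in the sketch below. It computes the upper-tail probability with exact integer arithmetic and omits the multiple test correction step; the function name and the example counts are hypothetical and are not results from this study.

```python
from math import comb


def hypergeom_upper_tail(k, annotated, query_size, background):
    """P(X >= k) for a hypergeometric draw.

    background: number of genes in the reference set
    annotated:  number of background genes carrying the GO term
    query_size: number of genes in the query list
    k:          number of query genes carrying the GO term
    """
    total = comb(background, query_size)
    upper = min(annotated, query_size)
    favorable = sum(
        comb(annotated, i) * comb(background - annotated, query_size - i)
        for i in range(k, upper + 1)
    )
    return favorable / total


# Hypothetical counts: 12 of 214 query genes carry a term held by 150 of 15116 background genes.
print(hypergeom_upper_tail(12, annotated=150, query_size=214, background=15116))
```

The smaller this probability, the less likely it is that the observed number of annotated genes in the query list arose by chance given the background frequency of the term.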

Database Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011)
(Supplemental Table S8) were chosen as query genes to search against CORNET Maize
(https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using
the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET
database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This
resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff
was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved.
Maize proteins were aligned against TAIR10 protein sequences using standalone BLASTP version 2.2.28+
(Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice
and discussion on the topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department
of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful
discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized and log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. Sixteen query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression Data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year generated by microarray platforms
GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq
Illumina samples (solid line) per year from 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation
(VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the
VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman
correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for networks constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile.
Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest
and lowest AUROC values.

Figure 3. Similarity among ten inference methods in network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes,
respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard
normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by
VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA,
MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC,
Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient;
BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different
sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted
logarithmic regression models are plotted as black lines. R-squared and p-values were calculated by the lm() function in R. PCC,
Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy
NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC
(black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled as S12_1 to S404; S1266 included all samples from the 17 experiments. B,
Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266)
libraries were plotted against sample size. Networks with the same number of samples are
designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation
coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"),
and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks.
Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6 GCN performance comparison of networks constructed with 1266 libraries A Area under the ROC 692
curve (AUROC) values from GO evaluation of single network (white bars) aggregation network (grey bars) and 693
protein network (dark grey bars) were compared for network constructed using PCC(p) SCC(s) MRNET(m) 694
or CLR(c) Bold horizontal lines indicate median Asterisks indicate mean and grey dots indicate outliers B 695
AUROC values from PPPTY evaluation of single network (white bars) aggregation network (grey bars) and 696
protein network (dark grey bars) were compared for network constructed using PCC(p) SCC(s) MRNET(m) 697
or CLR(c) Bold horizontal lines indicate median Asterisks indicate mean and grey dots indicate outliers 698
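For readers unfamiliar with the AUROC values compared in this figure, the sketch below computes an AUROC from the Mann-Whitney rank-sum statistic, which is the standard definition. It is not the authors' evaluation code; the function name `auroc`, the scores, and the labels are hypothetical. Here a score stands for how strongly a network predicts a gene to belong to a gene set, and a label of 1 marks genes that are annotated members of that set.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# Hypothetical prediction scores and gene-set membership labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.1]
labels = [1, 1, 0, 1, 0]
print(auroc(scores, labels))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to every annotated gene being ranked above every unannotated gene.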

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among
the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET
single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks;
106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148
shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214
co-expressed genes queried by 16 cell wall pathway genes.
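The edge overlap counted in panel B amounts to intersecting the top-ranked edges of each network. The sketch below shows one way to do this, assuming each network is available as a mapping from gene pairs to edge weights; the toy networks, gene names, and the helper `top_edges` are hypothetical. Edges are treated as undirected, so (g1, g2) and (g2, g1) count as the same edge.

```python
def top_edges(weights, n):
    """Return the n highest-weighted edges as a set of undirected gene pairs."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:n]}

# Hypothetical toy networks standing in for PA, SA, and MS.
pa = {("g1", "g2"): 0.90, ("g2", "g3"): 0.80, ("g1", "g3"): 0.20}
sa = {("g2", "g1"): 0.95, ("g3", "g2"): 0.70, ("g1", "g4"): 0.60}
ms = {("g1", "g2"): 0.50, ("g3", "g4"): 0.40, ("g2", "g3"): 0.30}

shared = top_edges(pa, 2) & top_edges(sa, 2) & top_edges(ms, 2)
print(sorted(sorted(edge) for edge in shared))
```

With the real networks, `n` would be one million and the size of `shared` would be the number reported in the center of the Venn diagram.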

Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA),
and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C,
Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes
are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported into the R environment for the
normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue
and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than
10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our
datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype.
MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries
were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the
dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by
Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million
mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line,
CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm (log2) transformation. VST
values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using
the dnorm() function in R, which takes the mean and standard deviation of the log2-transformed expression
values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a
function of gene length in base pairs (bp).
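The CPM and RPKM values and the normal reference curve mentioned in this legend follow standard formulas, sketched below for a single library. This is an illustration, not the authors' pipeline (which ran in R): the function names, the toy counts and gene lengths, and the pseudocount of 1 added before the log2 transformation are all assumptions. `NormalDist(...).pdf(x)` plays the role of dnorm(x, mean, sd) in R.

```python
import math
from statistics import NormalDist, mean, stdev

def cpm(counts):
    """Counts Per Million: count * 1e6 / library size."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: count * 1e9 / (library size * gene length in bp)."""
    total = sum(counts)
    return [c * 1e9 / (total * n_bp) for c, n_bp in zip(counts, lengths_bp)]

def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount is an assumption to avoid log2(0)."""
    return [math.log2(v + pseudocount) for v in values]

def normal_reference(values):
    """Normal distribution with the sample mean and standard deviation of the values."""
    return NormalDist(mean(values), stdev(values))

# Hypothetical read counts and gene lengths for one library.
counts = [10, 20, 70, 900]
lengths_bp = [1000, 2000, 500, 3000]
logged = log2_transform(cpm(counts))
ref = normal_reference(logged)
print([round(ref.pdf(x), 4) for x in logged])
```

Unlike CPM, RPKM divides by gene length, which is why panel C plots expression against gene length for the different normalizations.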

Supplemental Figure 3. CPM-normalized, log2-transformed maize gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental
stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated
by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation
comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was
plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO
datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as
calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons
for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the
diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets
were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated
by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC,
Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient;
Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, additive ARACNE; MA, multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
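Two of the correlation measures abbreviated in this legend, PCC and SCC, differ only in whether the raw values or their ranks are correlated. The sketch below makes that relationship explicit using textbook definitions; it is not the implementation used in the paper, and the function names and the two toy expression profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression profiles of two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]
print(round(pearson(gene_a, gene_b), 3), round(spearman(gene_a, gene_b), 3))
```

Because `gene_b` increases monotonically but not linearly with `gene_a`, the rank-based measure reaches 1 while the Pearson coefficient stays slightly below it.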

Supplemental Figure 5. Characteristics of all 1720 PPPTY gene set genes (ALL_1720) and of genes with the
highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average
expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed
elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A,
AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were
plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12)
to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included
are designated "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range
above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed
lines are the average AUROC values from the 17 individual networks of each category. Mean values of each
network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
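The outlier rule stated in this legend (more than 1.5 times the interquartile range beyond the quartiles) can be written out as follows. This is a sketch only: the helper name `boxplot_outliers` and the example values are hypothetical, and the quartiles come from Python's `statistics.quantiles` with the inclusive method, which may differ slightly from the hinges used by R's boxplot on small samples.

```python
from statistics import quantiles

def boxplot_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical AUROC-like values with one extreme point.
values = [0.61, 0.63, 0.64, 0.66, 0.67, 0.68, 0.70, 0.95]
print(boxplot_outliers(values))
```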

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with
11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal
lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red
and blue curves show the power-law fitted distribution. The R² value indicates the goodness of fit to the
power-law model.

Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree
(number of nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated
beside it.
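A power-law fit to a node degree distribution, as shown by the red curves, can be sketched by linear regression on log-transformed values. This is one common approach and is only an assumption about how such a fit might be obtained; the legend does not say which procedure produced the published curves. The function names and the toy edge list are hypothetical, and the R² here is computed in log-log space.

```python
import math
from collections import Counter

def degree_distribution(edges):
    """Map each node degree to the number of nodes that have it."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())

def fit_power_law(distribution):
    """Fit nodes = c * degree**(-gamma) by least squares on log-log values.

    Returns (c, gamma, R-squared in log-log space).
    """
    xs = [math.log(k) for k in distribution]
    ys = [math.log(v) for v in distribution.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return math.exp(intercept), -slope, 1.0 - ss_res / ss_tot

# Hypothetical toy network: a few edges between genes g1..g6.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g5", "g6")]
dist = degree_distribution(edges)
c, gamma, r2 = fit_power_law(dist)
print(dict(dist), round(gamma, 3), round(r2, 3))
```

A high R² with a positive exponent would indicate that a few genes have many connections while most have few, the pattern usually described as scale-free.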

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in color. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake: a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
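The CPM and RPKM normalizations named in the Figure 2 legend can be illustrated with a minimal Python sketch. It uses the textbook definitions only and is not the pipeline used in this study; the gene names, read counts and gene lengths are hypothetical toy values, and VST is omitted because it requires a fitted dispersion model.

```python
def cpm(counts):
    """Counts per million for one library: count * 1e6 / library size."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: CPM divided by gene length in kb."""
    per_million = cpm(counts)
    return {gene: per_million[gene] / (lengths_bp[gene] / 1000) for gene in counts}


# Hypothetical toy library: three genes, one million mapped reads in total.
counts = {"geneA": 500_000, "geneB": 300_000, "geneC": 200_000}
lengths_bp = {"geneA": 2000, "geneB": 1000, "geneC": 500}

cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
```

In this toy library geneA has the largest CPM, but geneC has the largest RPKM because RPKM also divides by gene length.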
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are displayed on a color scale from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
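The scaling and distance steps described in the Figure 3 legend can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: each row (one GO term) is scaled with the sample standard deviation, the AUROC values are invented, and only three of the methods are shown.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def zscore(values):
    """Scale one row to mean 0 and unit (sample) standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical AUROC values: rows are GO terms, columns are inference methods.
methods = ["PCC", "CSC", "CLR"]
auroc_by_term = [
    [0.60, 0.62, 0.70],
    [0.55, 0.56, 0.65],
    [0.72, 0.71, 0.60],
]

scaled = [zscore(row) for row in auroc_by_term]
# One scaled profile per method (a column of the scaled matrix).
profiles = {m: [row[i] for row in scaled] for i, m in enumerate(methods)}
distances = {
    (a, b): euclidean(profiles[a], profiles[b]) for a, b in combinations(methods, 2)
}
closest_pair = min(distances, key=distances.get)
```

Methods whose scaled profiles lie close together would be grouped first by a distance-based hierarchical clustering; in this toy matrix that pair is PCC and CSC.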
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
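The logarithmic model in Figure 4 has the form AUROC = a + b × ln(sample size). The legend states that the fit was done with lm() in R; the Python sketch below is only an illustrative ordinary least-squares equivalent that returns the intercept, slope and R-squared and does not compute p-values. The AUROC values are synthetic (generated from a = 0.5, b = 0.02), and the intermediate sample sizes are hypothetical.

```python
from math import log


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample size)."""
    x = [log(n) for n in sample_sizes]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(aurocs) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - y_mean) ** 2 for yi in aurocs)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic example: AUROC values that follow the log model exactly.
sample_sizes = [12, 100, 500, 1266]
aurocs = [0.5 + 0.02 * log(n) for n in sample_sizes]
intercept, slope, r_squared = fit_log_model(sample_sizes, aurocs)
```

A positive slope on the log scale means that each doubling of the sample size adds the same fixed increment to the average AUROC, so the gain per added library shrinks as the network grows.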
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means and grey dots indicate outliers.
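The AUROC statistic reported throughout these legends can be computed from ranks alone: it equals the probability that a randomly chosen positive gene scores higher than a randomly chosen negative gene, with ties counted as one half. The Python sketch below illustrates that generic definition with invented scores and labels; it is not the evaluation code used in this study.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical network-derived scores and known annotations (1 = in the gene set).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
example_auroc = auroc(scores, labels)
```

Here three of the four positive-negative pairs are ordered correctly, giving an AUROC of 0.75; a value of 0.5 corresponds to random ranking.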
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes. Enriched terms shown in C include chromatin assembly, nucleosome organization, DNA replication, DNA repair and gene silencing; enriched terms shown in D include cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process and phenylpropanoid biosynthetic process.
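The overlap analysis summarized in Figure 7A can be sketched in Python with plain set operations. The sketch assumes that "highest interacting" means highest node degree, which the legend does not state explicitly; the gene identifiers and degree values are hypothetical, and a top-3 cutoff stands in for the top 1000.

```python
def top_n_genes(degree_by_gene, n):
    """Return the n genes with the highest degree as a set."""
    ranked = sorted(degree_by_gene, key=degree_by_gene.get, reverse=True)
    return set(ranked[:n])


# Hypothetical degree tables for the three networks named in the legend.
pa_degree = {"g1": 50, "g2": 40, "g3": 30, "g4": 5}
sa_degree = {"g1": 45, "g3": 35, "g5": 25, "g2": 3}
ms_degree = {"g3": 60, "g1": 20, "g6": 15, "g5": 1}

pa_top = top_n_genes(pa_degree, 3)
sa_top = top_n_genes(sa_degree, 3)
ms_top = top_n_genes(ms_degree, 3)

shared = pa_top & sa_top & ms_top          # centre of the Venn diagram
pa_only = pa_top - sa_top - ms_top         # genes unique to the PA network
venn_counts = {
    "PA only": len(pa_only),
    "SA only": len(sa_top - pa_top - ms_top),
    "MS only": len(ms_top - pa_top - sa_top),
    "all three": len(shared),
}
```

The same set operations apply to edges (panel B) if each edge is stored as an order-independent pair such as a frozenset of its two gene identifiers.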
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2 tpc114132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170 pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
2013; Stelpflug et al., 2015; Walley et al., 2016). In our analysis, genes with CPM > 2 in more than 1000 samples were included. This filter dramatically reduced zero-count values in the raw data from 30.949% to 0.367%. Moreover, a prior count of one was added before log2 transformation (expression = log2(x + 1), where x is the CPM or RPKM value) to avoid problems with the remaining zero values. The log2 transformation reduced the skew of the distributions and the influence of extreme values represented by outliers (Supplemental Fig. 2). We therefore consider log2 transformation important for our data.
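The filtering and transformation steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names `cpm` and `filter_and_log`, the toy count table, and the `min_samples=2` default are hypothetical (the analysis above required CPM > 2 in more than 1000 samples).

```python
import math

def cpm(counts):
    """Convert a genes x samples count table to counts per million, per sample."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)] for row in counts]

def filter_and_log(counts, min_cpm=2.0, min_samples=2, prior=1.0):
    """Keep genes with CPM > min_cpm in more than min_samples samples,
    then return (gene index, log2(CPM + prior) values) for each kept gene."""
    kept = []
    for gene_idx, row in enumerate(cpm(counts)):
        if sum(v > min_cpm for v in row) > min_samples:
            kept.append((gene_idx, [math.log2(v + prior) for v in row]))
    return kept

# Hypothetical toy table: 3 genes x 4 samples, each library summing to one million reads.
toy = [
    [500000, 400000, 300000, 600000],
    [499999, 599999, 699998, 399999],
    [1, 1, 2, 1],  # never exceeds 2 CPM, so it is filtered out
]
kept = filter_and_log(toy)
print([gene for gene, _ in kept])
```

Because the prior count is added before taking the logarithm, a remaining zero maps to log2(1) = 0 rather than to negative infinity.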
The distribution of gene expression across the 1266 libraries formed a bell-shaped curve with a small additional peak of low expression for all three methods (Supplemental Fig. 2). To determine whether these low expression values came from a few or many libraries, elements within the range of expression that corresponded to the observed peak (< -37 CPM; Supplemental Fig. 2B) were extracted from the CPM-normalized expression matrix and matched to their originating libraries. This showed that the low expression elements were not limited to specific libraries, but eight libraries contributed over 25% of the low elements. A gene ontology enrichment analysis failed to identify significant gene ontology descriptors within the subset of 43 genes that were defined as lowly expressed (data not shown). All eight of these libraries were from pollen tissue, where the average gene expression, at 147 Counts Per Million (CPM), is lower than the average gene expression of the other 79 tissues combined, at 183 CPM. Hierarchical clustering and a correlation heatmap with the same data (Stelpflug et al., 2015) show the uniqueness of the pollen tissue expression pattern (Langfelder and Horvath, 2008) (Supplemental Fig. 3). When the lowly expressed elements from RPKM- and VST-normalized data were analyzed to determine library origin and GO enrichment (data not shown), we found a similarly high proportion of pollen-specific libraries without significant GO categories. In pollen, some highly expressed genes are considered orphan genes (Wu et al., 2014) because they lack detectable homologs in other species. To investigate whether these lowly expressed genes were orphan genes, their sequences were searched against the Setaria italica genome (JGIv2) using BLASTX (e-value < 1E-03). Setaria italica (foxtail millet) is a close relative of maize, from which it diverged 23.4 million years ago (MYA) as estimated by TimeTree (Kumar et al., 2017). Only 1 of the 43 genes lacked detectable homologs in Setaria italica (data not shown), indicating that the majority of these genes are not likely to be orphan genes.
Because RPKM normalization accounts for gene length, the distribution of gene length versus expression for the RPKM method was compared with that of data normalized by the VST and CPM methods. VST- and CPM-normalized data showed very similar overall patterns, with no clear linear relationship between gene length and average expression (Supplemental Fig. 2C). RPKM-normalized data displayed an apparent bias toward elevated expression of a small number of genes less than 5000 bp in length and lower expression of long genes, suggesting that this normalization method might skew the distribution of expression for some genes. Overall, in spite of these differences, the three normalization methods resulted in similar distributions of expression for most of the genes included in the analysis. Additional analysis was completed to determine whether the three normalization methods influence network performance.
Network Performance Does Not Differ Based Upon Normalization Method
To compare the efficacy of the three normalization and ten inference methods, a GCN was generated for each combination of normalization and inference method. All networks were then rank-standardized so that edge weights ranged from 0 to 1 (see Methods). All network evaluations used the whole adjacency matrix (15116 × 15116 in RNA-Seq networks; 11429 × 11429 or 17862 × 17862 in protein networks) without a cut-off.
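As a rough illustration of rank standardization, the sketch below replaces each edge weight with its rank among all gene pairs, divided by the number of pairs, so that weights fall between 0 and 1. The exact procedure is given in the Methods, which are not reproduced here, so the details (ranking over unique gene pairs, average ranks for ties, a zero diagonal) and the name `rank_standardize` are assumptions made for this example.

```python
def rank_standardize(adj):
    """Replace each edge weight of a symmetric matrix by rank / number of edges.

    Ranks are computed over the upper triangle (unique gene pairs); tied
    weights share the average rank of their run. The diagonal is left at 0.
    """
    n = len(adj)
    edges = [(adj[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    edges.sort(key=lambda e: e[0])
    out = [[0.0] * n for _ in range(n)]
    k = 0
    while k < len(edges):
        # Extend m to the end of the run of weights tied with edges[k].
        m = k
        while m + 1 < len(edges) and edges[m + 1][0] == edges[k][0]:
            m += 1
        avg_rank = (k + m) / 2 + 1  # ranks are 1-based
        for _, i, j in edges[k:m + 1]:
            out[i][j] = out[j][i] = avg_rank / len(edges)
        k = m + 1
    return out

# Hypothetical 3-gene correlation matrix.
adj = [
    [1.0, 0.9, -0.2],
    [0.9, 1.0, 0.5],
    [-0.2, 0.5, 1.0],
]
std = rank_standardize(adj)
```

Working with ranks rather than raw scores puts correlation coefficients and mutual information values on a common scale, which is what allows networks from different inference methods to be compared or combined.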
The performance of the different networks was measured by comparing the area under the receiver operating characteristic curve (AUROC). AUROC is a measurement used to evaluate the accuracy of classification models, making it suitable for evaluating GCNs (Gillis and Pavlidis, 2011; Ma and Wang, 2012; Liu et al., 2017). AUROC values range from 0 to 1: a value closer to 1 indicates that the network discriminates nonrandom patterns and approaches perfect classification, random networks return values close to 0.5, and values closer to 0 indicate a high degree of incorrect classification. While an AUROC value close to 1 is optimal, values over 0.7 suggest good performance when analyzing large, diverse networks (Gillis and Pavlidis, 2011).
To set the AUROC baseline for random networks, maize gene IDs in the normalized expression matrix were shuffled 10 times (for MRNET and CLR) or 1000 times (for PCC). Networks were inferred from the randomized expression matrices using the designated algorithms and then evaluated. The resulting AUROC values from randomized networks were very close to 0.5 (Supplemental Table S2).
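The interpretation of AUROC and of the random baseline can be made concrete with a small sketch. It computes AUROC as the probability that a randomly chosen positive gene scores higher than a randomly chosen negative gene, and then estimates a baseline by shuffling which genes carry the positive label. This is a simplification: the study shuffled gene IDs in the expression matrix and re-inferred each network, whereas the hypothetical `shuffled_baseline` below only permutes labels against a fixed score vector. All names and the toy data are assumptions.

```python
import random

def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def shuffled_baseline(scores, labels, n_shuffles=200, seed=0):
    """Mean AUROC after repeatedly shuffling which genes carry the positive label."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        total += auroc(scores, shuffled)
    return total / n_shuffles

# Hypothetical toy data: 40 genes, of which the 10 highest-scoring are true positives.
scores = [i / 40 for i in range(40)]
labels = [i >= 30 for i in range(40)]

print(auroc(scores, labels))              # perfect separation
print(shuffled_baseline(scores, labels))  # expected to be close to 0.5
```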
AUROC values were calculated and compared for three different network characteristics. The first characteristic was designed to test whether the network identified genes with known or predicted co-expression patterns, based upon prior results and inclusion in two existing datasets that could serve as positive controls for co-expression. The maize metabolic pathway database (MaizeCyc) contains 413 pathways with more than two genes and was built from evidence collected from genome annotation, phylogenetic distance, and known genes in maize, rice, and Arabidopsis (Monaco et al., 2013). The maize protein-protein interaction database (PPIM), the second dataset used in this analysis, is based upon both predicted and experimentally detected protein interactions (Zhu et al., 2016). Only high-confidence interactions from PPIM were used, defined as those ranking in the top 5% in their model (Zhu et al., 2016). For comparison with the GCN, genes within the same MaizeCyc or PPIM pathways were considered co-expressed. The MaizeCyc and PPIM datasets were combined, and genes with fewer than 5 interactions were excluded from evaluation, creating a compiled dataset referred to herein as the Protein-Protein and Pathway dataset (PPPTY). PPPTY had 1720 genes and 104856 interactions that were used in this evaluation. The AUROC value was calculated for each of the 1720 genes.
To assess the effect of normalization method on GCNs, AUROC values for all ten inference methods were averaged for each of the three normalization methods. All three normalization methods scored similarly in comparison with the PPPTY dataset (Fig. 2B), with a mean AUROC value of around 0.575 for each, suggesting that the predicted networks were more selective than a random network.
The second characteristic was the presence of similar gene ontology (GO) information for maize genes within a detected co-expression set, based upon "guilt by association", which assumes that specific subgroups of co-expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from AgriGO (Du et al., 2010), which uses signature integration by InterPro, rather than co-expression data, to map gene IDs to GO terms. InterPro provided over 108 million stable GO terms to the functional protein information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations provide a reliable evaluation resource that is independent of co-expression data. To assess this characteristic, gene ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for each set of co-expression matrices, and the results were compared. Co-expression matrices were assessed by 3-fold cross-validation, which involved masking GO terms from some genes to test whether the masked GO terms could be predicted from gene expression patterns. In total, 277 GO terms were included in this analysis.
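The cross-validated neighbor voting evaluation can be sketched as follows for a single GO term. In each fold, one third of the annotated genes have their annotation masked; every candidate gene is then scored by the share of its total edge weight that connects it to the still-annotated genes, and the AUROC measures how well the masked genes are recovered. This is a simplified reading of neighbor voting, not the implementation used in the study; the names `neighbor_voting_auroc` and `auroc`, the fold assignment, and the toy network are assumptions.

```python
def auroc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def neighbor_voting_auroc(adj, annotated, n_folds=3):
    """Mean AUROC over folds for recovering masked members of one gene set."""
    n = len(adj)
    pos = sorted(annotated)
    fold_aurocs = []
    for fold in range(n_folds):
        held_out = set(pos[fold::n_folds])   # annotations masked in this fold
        train = set(pos) - held_out          # annotations still visible
        scores = {}
        for g in range(n):
            if g in train:
                continue  # only masked and unannotated genes are scored
            degree = sum(adj[g][h] for h in range(n) if h != g)
            vote = sum(adj[g][h] for h in train)
            scores[g] = vote / degree if degree > 0 else 0.0
        candidates = list(scores)
        fold_aurocs.append(
            auroc([scores[g] for g in candidates], [g in held_out for g in candidates])
        )
    return sum(fold_aurocs) / n_folds

# Hypothetical toy network: two modules of six genes, strong edges within a module.
n = 12
adj = [[0.0 if i == j else (0.9 if (i < 6) == (j < 6) else 0.1) for j in range(n)]
       for i in range(n)]
annotated = {0, 1, 2, 3, 4, 5}  # the GO term covers exactly the first module

print(neighbor_voting_auroc(adj, annotated))
```

In this toy case the annotated genes form a tight module, so masked members are always scored above unannotated genes; real GO terms are recovered far less cleanly.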
When GO characteristics were used to assess the networks, all three normalization methods performed similarly, but the AUROC values, at around 0.689 for each, were higher than those observed for comparisons with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions. The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets were 0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant difference in the AUROC scores associated with the GCNs for the characteristics that were tested.
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a subset of co-expressed genes. The interactions between such a protein and regulated DNA can be detected by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, there are five ChIP-Seq datasets available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et al., 2015; Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example, Opaque2 is specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral tissues (Bolduc et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed (Morohashi et al., 2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for comparison, with 26 confirmed binding targets that are relatively highly expressed in most maize tissues (Yang et al., 2016). Histone deacetylation often correlates with decreased gene expression (Verdin and Ott, 2014). High-confidence HDA101 targets were defined as those that were discovered by ChIP-Seq and also showed increased gene expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101 targets were compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar among networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY evaluations.
Correlation Methods Perform Better than Mutual Information at Some Genes
After normalization, the expression matrices can be processed by different methods for GCN inference. To optimize this step, the AUROC values of six correlation methods (PCC, SCC, KCC, GCC, BIC, CSC) and four mutual information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices generated by each of the three normalization methods (VST, CPM, RPKM) and then averaged. In general, correlation methods are more computationally efficient, while MI methods are able to reveal non-linear relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC, KCC, and BIC are less sensitive to outliers because SCC and KCC consider only rank information and BIC is calculated from the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to be a better correlation method for gene expression analysis because of its capacity to detect non-linear relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and for analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR have revealed extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same AUROC-based tests were performed as described for the testing of normalization methods.
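To make the contrast between three of these measures concrete, the sketch below implements the Pearson correlation (PCC), the Spearman rank correlation (SCC), and cosine similarity (taken here to be what CSC denotes) in plain Python and applies them to a toy expression profile containing one outlier. It is illustrative only: the function names and data are assumptions, and the study itself relied on existing R packages rather than code like this.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    k = 0
    while k < len(order):
        m = k
        while m + 1 < len(order) and values[order[m + 1]] == values[order[k]]:
            m += 1
        for idx in order[k:m + 1]:
            result[idx] = (k + m) / 2 + 1
        k = m + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def cosine(x, y):
    """Cosine similarity; unlike Pearson, the profiles are not mean-centered."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# Hypothetical profiles: gene_a rises monotonically with gene_b but has one extreme value.
gene_a = [1, 2, 3, 4, 100]
gene_b = [1, 2, 3, 4, 5]

print(round(pearson(gene_a, gene_b), 3))   # pulled down by the outlier
print(round(spearman(gene_a, gene_b), 3))  # unaffected, since only ranks matter
```

The outlier lowers the Pearson coefficient while the rank-based Spearman coefficient still reports a perfect monotonic relationship, which is the sensitivity difference noted above.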
Assessed with the GO dataset, the 277 AUROC values were averaged to create one average value for each of the 10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization methods for the six correlation methods was 0.718, while the average AUROC for the four MI methods was 0.646. The majority of the 277 GO terms had similar AUROC values in the GCNs generated by the different correlation methods, and these patterns differed from those observed in the MI-generated GCNs (Fig. 3A). The similarity among different methods was also detectable by pairwise comparison of Pearson correlations between the methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed that the networks constructed using correlation methods resulted in higher AUROC values than those built with MI methods, although the CSC method resulted in lower AUROC values than the other correlation methods. As demonstrated for the GO evaluation, results from correlation methods were more similar to each other than to those from the MI methods (Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B), although this subset was small enough that the average AUROC value over the whole gene set was relatively low for those methods. The gene sets with the highest AUROC values in PCC, CSC, or MRNET were extracted. The gene sets were compared in terms of average expression (CPM) and average number of lowly expressed elements (CPM < 0). The CSC gene set had the smallest number of low expression elements and had higher average expression than both the 1720-gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from the 26 targets in the HDA101 ChIP-Seq dataset revealed that the CSC GCN had the highest AUROC value and that MRNET/CLR GCNs resulted in slightly higher scores than correlation methods (Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes (Tzfadia et al., 2016). HDA101 is highly expressed in all samples, with an average expression value of 864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is better suited to the CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using the two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5% non-zero values for all comparisons, so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize GCN from a single expression matrix, but co-expression with some individual genes may be better detected using MI methods. Normalization method did not have a substantial influence on GCN performance, so only CPM normalization was used, in conjunction with PCC, SCC, MRNET, and CLR inference, for subsequent optimization of other parameters.
Increased Sample Size Had a Positive Effect On GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were conducted with different numbers of samples and experiments to empirically determine the effect of sample number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between 12 and 404 libraries. For this analysis, CPM normalization followed by each of four inference methods (PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were evaluated with both GO and PPPTY.
In both the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between the natural logarithm of sample size and average AUROC values (Fig. 4). The linear relationships were stronger for the PCC and SCC methods, which had higher R-squared values, indicating that correlation methods benefit more from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC values than the MRNET and CLR methods in the PPPTY and GO analyses for most of the individual networks (Fig. 5).
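The relationship described above, average AUROC regressed on the natural logarithm of sample size, can be sketched with an ordinary least-squares fit. The helper name `fit_log_linear` and the numbers are hypothetical: the toy AUROC values are generated from an exact log-linear rule purely to show the calculation and are not taken from Fig. 4.

```python
import math

def fit_log_linear(sample_sizes, aurocs):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    mx = sum(x) / len(x)
    my = sum(aurocs) / len(aurocs)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in aurocs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, aurocs))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical experiments spanning the reported range of 12 to 404 libraries.
sizes = [12, 24, 60, 150, 404]
toy_aurocs = [0.55 + 0.03 * math.log(n) for n in sizes]

slope, intercept, r_squared = fit_log_linear(sizes, toy_aurocs)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

A larger R-squared for one inference method than for another, as reported for PCC and SCC, means that sample size explains more of the variation in that method's AUROC values.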
Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also change the outcome of a GCN by buffering the effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated rank-standardized correlation/MI matrices were calculated from separate experiments to determine whether this approach enhanced GCN performance. Aggregating individual networks for meta-analysis can help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET, and CLR methods for GCN inference, as before, resulting in 68 single GCNs. The 17 experiments were aggregated for PCC, SCC, MRNET, and CLR individually and evaluated with the GO and PPPTY datasets.
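One simple way to aggregate networks built from separate experiments is to average their rank-standardized adjacency matrices edge by edge. The sketch below assumes an unweighted mean over matrices that share the same gene order and are already scaled to 0 to 1; the aggregation actually used is described in the Methods, which are not reproduced here, so the function name `aggregate_networks` and the averaging rule are assumptions.

```python
def aggregate_networks(networks):
    """Element-wise mean of equally sized, rank-standardized adjacency matrices."""
    if not networks:
        raise ValueError("at least one network is required")
    n = len(networks[0])
    for net in networks:
        if len(net) != n or any(len(row) != n for row in net):
            raise ValueError("all networks must be square and share the same gene order")
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical rank-standardized networks from two experiments over the same two genes.
experiment_1 = [[0.0, 1.0], [1.0, 0.0]]
experiment_2 = [[0.0, 0.5], [0.5, 0.0]]

print(aggregate_networks([experiment_1, experiment_2]))
```

An edge that ranks highly in many experiments keeps a high aggregated weight, while an edge that is strong in only one experiment is pulled toward the middle, which is one way aggregation can damp experiment-specific noise.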
Of the four aggregated networks that were evaluated, the two built with correlation methods (PCC and SCC) had higher AUROC values than the corresponding single networks from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR networks compared with the single networks from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCNs performed best using a ranked aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a robust GCN.
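The aggregation step can be sketched in a few lines of Python. The sketch assumes that each experiment has already been converted to a rank-standardized score matrix on a common gene order, and it aggregates by taking the element-wise mean; the use of a plain mean, the function names, and the toy matrices are assumptions made for illustration, not a description of the exact procedure used here.

```python
def aggregate(standardized):
    """Element-wise mean of rank-standardized score matrices, one per experiment."""
    n = len(standardized[0])
    count = len(standardized)
    return [[sum(m[i][j] for m in standardized) / count for j in range(n)]
            for i in range(n)]

def top_pairs(matrix, genes, k):
    """Return the k gene pairs with the highest aggregated score."""
    pairs = [(matrix[i][j], genes[i], genes[j])
             for i in range(len(genes)) for j in range(i + 1, len(genes))]
    return [(a, b) for _, a, b in sorted(pairs, reverse=True)[:k]]

# Hypothetical scores for three genes in three experiments; the third
# experiment is "noisy" and ranks the g1-g3 pair highest.
genes = ["g1", "g2", "g3"]
exp1 = [[1.0, 0.9, 0.2], [0.9, 1.0, 0.4], [0.2, 0.4, 1.0]]
exp2 = [[1.0, 0.8, 0.3], [0.8, 1.0, 0.5], [0.3, 0.5, 1.0]]
exp3 = [[1.0, 0.3, 0.9], [0.3, 1.0, 0.6], [0.9, 0.6, 1.0]]
agg = aggregate([exp1, exp2, exp3])
best = top_pairs(agg, genes, 1)
```

In this toy example the pair supported by two of the three experiments remains the top-ranked edge after aggregation, which is the buffering effect against heterogeneous samples that the strategy is meant to provide.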
327
The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because the mRNA level is thought to be related to the level and function of a protein of interest. However, many researchers have found inconsistencies between mRNA and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al., 2016). Although relatively little protein expression data is available, these data are amenable to GCN construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO datasets as previously mentioned.
GCNs constructed from protein expression did not exhibit superior AUROC values to those observed for RNA-Seq-based GCNs using the aggregation strategy (Fig. 6). When evaluated with the GO and PPPTY datasets, the performance of the protein network was lower than that of the aggregated network as well as the single network from 1266 samples. To confirm this result, a two-way ANOVA was computed with pairwise comparison for the GO evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 overlapped with our RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes to construct a protein network (Supplemental Fig. 7). No improvement could be detected for the protein network derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY evaluation from a one-sided Wilcoxon rank sum test.
351
PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be compared based upon several network characteristics, including clustering coefficient, number of nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007), number of detected modules, and number of genes in the largest module. The number of nodes is a basic construct in graph theory depicting the scale of a network. Clustering coefficient and number of modules model how densely nodes are connected in networks. Heterogeneity measures the variability of node connections. Centralization indicates how likely it is that some nodes have significantly more connections than average. In this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize networks were selected for comparison of basic network characteristics based on their overall performance: the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single network from 1266 total samples (MS). The three networks were constrained to include the top one million predicted interactions, or edges.
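Two of these characteristics can be made concrete with a minimal Python sketch. It assumes the commonly used degree-based definitions, with heterogeneity taken as the standard deviation of node degree divided by the mean degree and centralization taken as n/(n-2) x (max degree/(n-1) - density); the choice of the population standard deviation, the function names, and the toy graphs are assumptions for illustration only.

```python
from statistics import mean, pstdev

def degrees(edges):
    """Count connections per node from an undirected edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def heterogeneity(deg):
    """Coefficient of variation of node degree (population standard deviation)."""
    k = list(deg.values())
    return pstdev(k) / mean(k)

def centralization(deg):
    """n/(n-2) * (max degree/(n-1) - density), with density = mean degree/(n-1)."""
    k = list(deg.values())
    n = len(k)
    density = mean(k) / (n - 1)
    return n / (n - 2) * (max(k) / (n - 1) - density)

# A star (one hub, four spokes) versus a ring of five equally connected nodes.
star = [("hub", g) for g in ("g1", "g2", "g3", "g4")]
ring = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5"), ("g5", "g1")]
star_deg, ring_deg = degrees(star), degrees(ring)
```

The star graph gives the maximum centralization of 1 and a heterogeneity of 0.75, whereas the ring gives 0 for both, which matches the intuition that hub-dominated networks score high on both measures.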
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution (Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS network had the highest network centralization value. The network heterogeneity value of MS was over two times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table S4), consistent with the observed highest centralization value for this network. Centralization and heterogeneity are two measures that describe the degree distribution of networks. A scale-free network with more hubs has larger values of centralization and heterogeneity, whereas a network with larger values of centralization and heterogeneity may either contain a larger number of hubs or have a number of hubs that is not significantly large but an extremely imbalanced degree distribution. In biological networks, many observations have connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the possibility that high values resulted from an extremely imbalanced degree distribution. For the MS network, most highly connected genes interacted with a large number of lowly connected genes; this pattern is also reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental Fig. 8). The genes with the most interactions are expected to act as key components in GCNs (Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 83.5% unique genes (Fig. 7A). In total, 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong candidates for central biological regulators. The annotation of these genes suggests their participation in a range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation, and gene silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011), CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014), demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al., 2017).
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS network had fewer but larger modules than the PA and SA networks. Consequently, most genes in the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so these methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged, and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions (Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php
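The construction of the merged network can be sketched as a set intersection over the top-ranked edges of each network. The following Python sketch assumes each network is available as a mapping from an unordered gene pair to a score; the function names, the toy scores, and the cutoff of two edges (in place of one million) are hypothetical and chosen only to keep the example small.

```python
def top_edges(weights, n_top):
    """Keep the n_top highest-scoring edges; edges are unordered gene pairs."""
    ranked = sorted(weights, key=weights.get, reverse=True)[:n_top]
    return {frozenset(edge) for edge in ranked}

def merge_networks(networks, n_top):
    """Intersect the top edges of every network and collect the genes involved."""
    edge_sets = [top_edges(w, n_top) for w in networks]
    shared = set.intersection(*edge_sets)
    genes = set().union(*shared)
    return shared, genes

# Hypothetical edge scores for three small networks.
pa = {("a", "b"): 0.90, ("b", "c"): 0.80, ("c", "d"): 0.10}
sa = {("b", "a"): 0.95, ("b", "c"): 0.70, ("a", "d"): 0.20}
ms = {("a", "b"): 0.60, ("c", "d"): 0.50, ("b", "c"): 0.40}
shared, genes = merge_networks([pa, sa, ms], n_top=2)
```

Storing each edge as a frozenset makes the pair (a, b) equal to (b, a), so the orientation in which an edge was recorded does not affect the intersection.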
411
Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged network was compared to the existing knowledge of this pathway. Sixteen well-characterized components of cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in the network that met the analysis criteria. As expected, cell wall-related GO terms were enriched for these 214 genes (Fig. 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly selected genes as queries returned only 3.27% (7/214) that were involved in cell wall-related pathways. This suggests that the network discriminated co-expressed genes and identified some known components of the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT, CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11, IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3 (Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same 16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes (Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817 interactions with 76 genes; 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes (PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more power to find genes that function together without physically interacting. This case study shows that a robust, optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network is provided as supplemental material (Supplemental Dataset S2).
452
Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With over five thousand libraries available for maize, there is now ample data to support GCN analysis. This comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq data will provide a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis, consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided (Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the normalization is on a library basis, which means genes within the same library are normalized by similar factors. So when the network is constructed by PCC and BIC, where expression vectors are centered by mean or median values, the effect of different normalization methods is probably small. The two rank correlations, SCC and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same is true for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods shared similar expression distributions (Supplemental Fig. 2), MI estimations from different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall performance. This is consistent with a study of human GCN analysis (Ballouz et al., 2015), but SCC did not score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008; Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference methods showed more robust performance than any single inference method in in silico and E. coli expression networks, referring to "the wisdom of crowds". However, for analysis of the available maize data, integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq-based maize GCN. This optimization may apply to a range of datasets that share characteristics with maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.
488
Materials and Methods
RNA-Seq Data Collection and Process
The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples, generated on the Illumina HiSeq 2000 or HiSeq 2500, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). The adapters in the fastq files were trimmed with Cutadapt 1.8.1 (Martin, 2011). The adapter-removed files were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by featureCounts 1.5.0 (Liao et al., 2014) from aligned bam files (Supplemental Fig. S1). In total, 26 libraries with fewer than 5 million total reads and 11 libraries with a total alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
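The library exclusion criteria can be expressed as a simple filter. The Python sketch below assumes that the total read count and the overall alignment rate of each library have already been collected from the alignment logs; the function name, the library names, and the values are hypothetical.

```python
def passes_qc(total_reads, alignment_rate, min_reads=5_000_000, min_rate=0.70):
    """Keep a library only if it has enough reads and a sufficient alignment rate."""
    return total_reads >= min_reads and alignment_rate >= min_rate

# Hypothetical libraries: (total reads, overall alignment rate).
libraries = {
    "lib_a": (12_000_000, 0.91),
    "lib_b": (3_200_000, 0.88),   # too few reads
    "lib_c": (9_500_000, 0.64),   # alignment rate too low
}
kept = [name for name, (reads, rate) in libraries.items() if passes_qc(reads, rate)]
```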
502
Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase Per Million (RPKM) were calculated with the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM value). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in further analysis (15116 genes).
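The CPM and RPKM transformations and the expression filter can be illustrated with a simplified Python sketch. It deliberately omits the TMM scale factors (the raw library total is used as the library size) and does not attempt VST, so it is not equivalent to the edgeR and DESeq2 calculations; the function names and the toy counts are assumptions for illustration.

```python
import math

def cpm(counts):
    """counts: sample -> gene -> raw count. Library size is the raw column total (no TMM)."""
    out = {}
    for sample, genes in counts.items():
        library_size = sum(genes.values())
        out[sample] = {g: c * 1e6 / library_size for g, c in genes.items()}
    return out

def rpkm(counts, lengths_bp):
    """CPM divided by gene length in kilobases."""
    return {s: {g: v * 1e3 / lengths_bp[g] for g, v in genes.items()}
            for s, genes in cpm(counts).items()}

def log2p1(table):
    """expression = log2(x + 1) for every value in the table."""
    return {s: {g: math.log2(v + 1) for g, v in genes.items()}
            for s, genes in table.items()}

def expressed_genes(cpm_table, min_cpm=2, min_samples=1000):
    """Genes with CPM above min_cpm in more than min_samples samples."""
    genes = next(iter(cpm_table.values())).keys()
    return [g for g in genes
            if sum(cpm_table[s][g] > min_cpm for s in cpm_table) > min_samples]

# Hypothetical counts for two genes in two samples.
counts = {"s1": {"g1": 500, "g2": 1500}, "s2": {"g1": 0, "g2": 4000}}
cpm_table = cpm(counts)
rpkm_table = rpkm(counts, {"g1": 2000, "g2": 1000})
log_cpm = log2p1(cpm_table)
kept = expressed_genes(cpm_table, min_cpm=2, min_samples=1)
```

With only two samples, the filter is called with min_samples=1 so that a gene must exceed the threshold in both samples; the defaults correspond to the thresholds stated above.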
510
Network Inference
Six correlation coefficient methods and four mutual information methods were applied to normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). Cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned a rank of one. Ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all network weights ranged from zero to one.
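The rank standardization described in the last two sentences can be sketched as follows. The Python sketch ranks every element of the matrix with the smallest value receiving rank one, divides by the number of elements, and sets the diagonal to one; giving tied values their average rank is an assumption (it keeps a symmetric matrix symmetric), and the function name and toy matrix are hypothetical.

```python
def rank_standardize(matrix):
    """Rank all elements (smallest = 1), divide by element count, set the diagonal to 1."""
    n = len(matrix)
    values = sorted(matrix[i][j] for i in range(n) for j in range(n))
    first, last = {}, {}
    for position, value in enumerate(values, start=1):
        first.setdefault(value, position)  # first rank occupied by this value
        last[value] = position             # last rank occupied by this value
    total = n * n
    # Tied values share the average of the ranks they occupy (an assumption).
    out = [[(first[matrix[i][j]] + last[matrix[i][j]]) / 2 / total for j in range(n)]
           for i in range(n)]
    for i in range(n):
        out[i][i] = 1.0
    return out

# Hypothetical symmetric score matrix for three genes.
scores = [[1.0, 0.2, 0.7],
          [0.2, 1.0, 0.5],
          [0.7, 0.5, 1.0]]
standardized = rank_standardize(scores)
```

Because only ranks are retained, matrices produced by different inference methods end up on the same zero-to-one scale and can be compared or combined directly.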
523
Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM- or VST-normalized expression matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
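The randomization can be sketched as a permutation of gene identifiers over otherwise unchanged expression profiles, which breaks the link between a gene and its annotations while preserving the correlation structure of the data. The Python sketch below is a minimal illustration; the function name, the seed argument, and the toy matrix are assumptions.

```python
import random

def shuffle_gene_ids(expression, seed=None):
    """expression: gene ID -> list of values. Returns the same rows under permuted IDs."""
    rng = random.Random(seed)
    ids = list(expression)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    return {new_id: expression[old_id] for new_id, old_id in zip(shuffled, ids)}

# Hypothetical expression matrix with four genes and three samples.
expression = {"g1": [1, 2, 3], "g2": [4, 5, 6], "g3": [7, 8, 9], "g4": [0, 0, 1]}
randomized = shuffle_gene_ids(expression, seed=1)
```

Repeating the shuffle with different seeds and re-evaluating each resulting network yields a baseline distribution of AUROC values expected by chance.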
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq-confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for GO datasets were calculated with the EGAD package (Ballouz et al., 2016) in R. Briefly, it utilizes the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding a subset of genes' GO terms and testing whether the hidden GO terms can be predicted from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
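For readers unfamiliar with the metric, AUROC can be computed directly as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example, with ties counted as one half. The Python sketch below implements this rank-based definition; it is an illustration only and is not the ROCR or EGAD code, and the function name and the scores are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a positive outscores a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical edge scores; label 1 marks gene pairs present in the reference set.
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
labels = [1, 0, 1, 0, 0]
value = auroc(scores, labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of reference pairs from all other pairs.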
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO dataset.
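The pairwise definitions can be written out as a small counting routine. The Python sketch below classifies every possible gene pair as TP, FP, TN, or FN given a set of predicted pairs and a set of reference pairs; the function name, the gene names, and the pairs are hypothetical.

```python
from itertools import combinations

def confusion_counts(genes, predicted, known):
    """Classify every unordered gene pair against predicted and reference pairs."""
    predicted = {frozenset(p) for p in predicted}
    known = {frozenset(p) for p in known}
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for pair in map(frozenset, combinations(genes, 2)):
        if pair in predicted:
            counts["TP" if pair in known else "FP"] += 1
        else:
            counts["FN" if pair in known else "TN"] += 1
    return counts

# Four hypothetical genes give six possible pairs.
genes = ["a", "b", "c", "d"]
counts = confusion_counts(genes,
                          predicted=[("a", "b"), ("a", "c")],
                          known=[("b", "a"), ("c", "d")])
true_positive_rate = counts["TP"] / (counts["TP"] + counts["FN"])
false_positive_rate = counts["FP"] / (counts["FP"] + counts["TN"])
```

Sweeping the score threshold that defines the predicted set and plotting the true positive rate against the false positive rate traces out the ROC curve whose area is reported as AUROC.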
557
Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distribution were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14-137 and the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
564
Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
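The hypergeometric test underlying this enrichment analysis can be sketched in Python as the upper-tail probability of observing at least k annotated genes in a query list. This is a generic illustration of the test and not the AgriGO implementation; the multiple test correction step is not included, and the function name and the numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n query genes are drawn from N background genes, K of which carry the term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Hypothetical example: 20 background genes, 5 annotated with a term,
# and a query list of 4 genes of which 3 carry the term.
p_value = hypergeom_pvalue(k=3, n=4, K=5, N=20)
```

In the setting described above, N would be the 15116 background genes, n the size of the query gene list, K the number of background genes annotated with a given GO term, and k the number of query genes carrying that term.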
571
Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).
582
Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.
588
Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 PPPTY gene set (ALL_1720), genes with highest AUROC values in CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.
622
626
Figure legends 627
628
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
638
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for samples constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for samples constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples constructed using ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
657
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
667
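The scaling and clustering steps described for Figure 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: each row (one GO term or gene) is scaled to mean 0 and standard deviation 1 across methods, using the sample standard deviation (an assumption, since the legend does not say which estimator was used), and methods are then compared by the Euclidean distance between their scaled profiles. The table values and names are hypothetical.

```python
import math
from statistics import mean, stdev


def zscore(values):
    """Scale one row (AUROC of one GO term across methods) to mean 0, SD 1."""
    mu = mean(values)
    sd = stdev(values)  # sample SD (n - 1); assumed, not stated in the legend
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical AUROC table: rows are GO terms, columns are inference methods.
methods = ["PCC", "SCC", "MRNET"]
auroc_table = {
    "term_1": [0.60, 0.62, 0.75],
    "term_2": [0.55, 0.57, 0.70],
    "term_3": [0.80, 0.78, 0.65],
}
scaled = {term: zscore(row) for term, row in auroc_table.items()}

# One profile per method (a column of the scaled table), then pairwise distances.
profiles = {m: [scaled[t][j] for t in scaled] for j, m in enumerate(methods)}
distances = {
    (a, b): euclidean(profiles[a], profiles[b])
    for i, a in enumerate(methods)
    for b in methods[i + 1:]
}
```

A hierarchical clustering routine would take `distances` as input; in this toy table the two correlation methods end up closer to each other than either is to MRNET.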
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-
sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic
regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R.
PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy
NETwork; CLR, Context Likelihood of Relatedness.

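The regression in Figure 4 was fitted with lm() in R. As an assumed Python equivalent, the sketch below fits AUROC = a + b * ln(sample size) by ordinary least squares and reports R-squared; it does not compute the p-value that lm() also returns. The sample sizes and AUROC values are made-up numbers placed exactly on a logarithmic curve, not data from the paper.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical networks: AUROC grows by 0.02 per unit of ln(sample size).
sizes = [12, 24, 48, 96]
toy_aurocs = [0.5 + 0.02 * math.log(n) for n in sizes]
slope, intercept, r_squared = fit_log_model(sizes, toy_aurocs)
```

Under this model, each doubling of the sample size adds the same increment (slope * ln 2) to the expected AUROC, which is the diminishing-returns pattern the log-transformed axis is meant to show.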
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC
(black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B,
Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266)
libraries were plotted against sample size. Networks with the same number of samples are designated “_1”,
“_2”, and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET,
Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among the single network (white, “1266”), aggregated network (grey,
“agg”), and protein network (dark grey, “pr”) using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks.
Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars),
and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s),
MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots
indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation
network (grey bars), and protein network (dark grey bars) were compared for networks constructed using
PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the
mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared
among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS,
MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks;
106591 edges were shared among the three networks. C, GO term enrichment, analyzed by AgriGO, for the 148
highly interacting genes shared among PA, SA, and MS. D, GO term enrichment, analyzed by AgriGO, for 214
co-expressed genes queried by 16 cell wall pathway genes.

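The overlap counts in the Venn diagrams of Figure 7 amount to set intersections over the top-ranked genes (or edges) of each network. The sketch below is a minimal illustration that assumes "most highly interacting" means highest node degree; the gene names, degrees, and helper names are hypothetical, and k is set to 2 instead of 1000 to keep the example small.

```python
def top_k(scores, k):
    """Return the k keys with the largest values (e.g. genes ranked by degree)."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def venn3_counts(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "all_three": len(a & b & c),
        "a_and_b_only": len((a & b) - c),
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
    }


# Hypothetical node degrees in three networks (PA, SA, MS).
pa_degree = {"g1": 9, "g2": 8, "g3": 1}
sa_degree = {"g1": 7, "g3": 6, "g2": 2}
ms_degree = {"g1": 5, "g4": 4, "g2": 3}

pa_top = top_k(pa_degree, 2)
sa_top = top_k(sa_degree, 2)
ms_top = top_k(ms_degree, 2)
counts = venn3_counts(pa_top, sa_top, ms_top)
shared_genes = pa_top & sa_top & ms_top
```

The same two functions apply to panel B if each edge is represented as a hashable key, for example a `frozenset` of its two gene identifiers, so that edge direction does not affect the overlap.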
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA),
and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
a reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with a reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. C,
Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes
are genes with a reported function in cell wall-related pathways in plants. Dark grey nodes are genes without
prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported to the R environment for the
normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue
and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than
10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our
datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype.
MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries
were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the
dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by
Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million
mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line,
CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST
values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using
the dnorm() function in R, which takes the mean value and standard deviation of the log2-transformed expression
values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a
function of gene length in base pairs (bp).

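Two of the three normalizations compared in Supplemental Figure 2 have simple closed forms, sketched below for a single library. This is an illustration, not the authors' R code: CPM divides each count by the library size in millions, and RPKM additionally divides by gene length in kilobases. The pseudocount added before the log2 transform is an assumption (the legend does not state one), and VST, which is model-based, is not reproduced here. Counts and gene lengths are hypothetical.

```python
import math


def cpm(counts):
    """Counts per million: count / (library size / 1e6)."""
    library_size = sum(counts)
    return [c * 1e6 / library_size for c in counts]


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: count / (length_kb * library size / 1e6)."""
    library_size = sum(counts)
    return [c * 1e9 / (library_size * length) for c, length in zip(counts, lengths_bp)]


def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount is an assumed choice."""
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical library with three genes.
toy_counts = [100, 300, 600]
toy_lengths_bp = [1000, 2000, 500]

toy_cpm = cpm(toy_counts)
toy_rpkm = rpkm(toy_counts, toy_lengths_bp)
toy_log_cpm = log2_transform(toy_cpm)
```

Because RPKM divides by gene length and CPM does not, plotting each against gene length, as in panel C, is a direct way to see whether a length dependence remains after normalization.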
Supplemental Figure 3. CPM-normalized, log2-transformed maize gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental
stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated
by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red indicates higher correlation.

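The tissue-to-tissue correlation in panel B of Supplemental Figure 3 is a Pearson correlation between two expression vectors measured over the same genes. A minimal sketch follows; the function is a plain textbook implementation and the expression values are hypothetical, not taken from the atlas.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 CPM values for the same five genes in two tissues.
pollen = [1.0, 5.5, 0.2, 7.1, 3.3]
leaf = [2.0, 4.0, 1.5, 6.0, 3.9]
r_pollen_leaf = pearson(pollen, leaf)
```

Computing this value between the pollen vector and each of the other tissue vectors yields one row of the heat map.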
Supplemental Figure 4. Pairwise comparison among the results of inference methods. A, GO evaluation
comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is
plotted in the diagonal blocks, between the AUROC scatter plots and the PCC values. AUROC values evaluated
with the GO dataset are plotted pairwise in the triangle below the diagonal, and the corresponding Pearson
correlation coefficients are shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for
VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal
blocks, between the AUROC scatter plots and the PCC values. AUROC values evaluated with the PPPTY dataset
are plotted pairwise in the triangle below the diagonal, and the corresponding Pearson correlation coefficients
are shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight
midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE;
MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes with
the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed
elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A,
AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were
plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12)
to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are
designated “_1”, “_2”, and “_3”. Outliers were defined as values more than 1.5 times the interquartile range
above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. Dashed
lines are the average AUROC values from the 17 individual networks in each category. Mean values of each
network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

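The outlier rule stated in this legend (values more than 1.5 times the interquartile range beyond the quartiles) can be written out directly. The sketch below uses Python's `statistics.quantiles` with the inclusive method; this quartile definition is an assumption and can differ slightly from the hinges that R's boxplot uses. The function name and data are hypothetical.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Values lying more than k * IQR below the 25% or above the 75% quantile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical values with one extreme observation.
toy_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
toy_outliers = tukey_outliers(toy_values)

# Hypothetical AUROC values with no extreme observation.
toy_aurocs = [0.50, 0.52, 0.55, 0.58, 0.60]
auroc_outliers = tukey_outliers(toy_aurocs)
```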
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, Area under the ROC curve (AUROC) values from PPPTY evaluation of protein networks with
17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed using the Pearson
Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey
dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red
and blue curves show the power-law fitted distributions. The R² value indicates the goodness of fit to the
power-law model.

Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree
(number of nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated
beside it.

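A power-law fit of a degree distribution, as shown in Supplemental Figures 8 and 9, is commonly obtained by linear regression on log-transformed values. The sketch below takes that approach as an assumption; the legends do not state which fitting procedure or software produced the reported curves. It fits count = a * degree^b and reports R² in log-log space. The degree list and function name are hypothetical.

```python
import math
from collections import Counter


def fit_power_law(degrees):
    """Fit count = a * degree**b by least squares on log-transformed values."""
    counts = Counter(d for d in degrees if d > 0)  # degree -> number of nodes
    xs = [math.log(degree) for degree in counts]
    ys = [math.log(n_nodes) for n_nodes in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    # R-squared is computed in log-log space, matching the fitted line.
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return math.exp(log_a), b, r_squared


# Hypothetical network: 16 nodes of degree 1, 4 of degree 2, 1 of degree 4,
# which follows count = 16 * degree**-2 exactly.
toy_degrees = [1] * 16 + [2] * 4 + [4]
a, b, r_squared = fit_power_law(toy_degrees)
```

A negative exponent b with R² close to 1 is the signature of the scale-free topology that these supplemental figures assess.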
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in color. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O’Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D’haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock’s challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol. doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One. doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for the years 2008 to 2016 were combined to represent the number of microarray samples (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples generated by microarray platforms GPL4032 and GPL12620 and submitted to the NCBI GEO database each year was identified by a text search (dashed line) and compared to the number of Illumina RNA-Seq samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
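The outlier rule in the Figure 2 caption can be illustrated with a short sketch. It reads the rule as the conventional boxplot criterion of 1.5 times the interquartile range beyond the quartiles. The function name `iqr_outliers`, the choice of the "inclusive" quantile method, and the AUROC values are assumptions made for illustration; the plotting software used for the figure may compute quartiles slightly differently.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    # "inclusive" quartiles are an assumption; boxplot tools may use hinges instead.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values, invented for illustration only.
auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.92]
print(iqr_outliers(auroc))
```

With these invented values, only the single high value falls outside the upper fence.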
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are colored from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
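The scaling and distance steps named in the Figure 3 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of the sample standard deviation, and the small AUROC matrix (rows as GO terms, columns as inference methods) are all assumptions.

```python
import math
from statistics import mean, stdev


def zscore_rows(matrix):
    """Scale each row (one GO term or gene) to zero mean and unit variance."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), stdev(row)  # sample standard deviation (assumption)
        scaled.append([(v - mu) / sd if sd > 0 else 0.0 for v in row])
    return scaled


def column_distances(matrix):
    """Euclidean distance between every pair of columns (inference methods)."""
    cols = list(zip(*matrix))
    return [[math.dist(a, b) for b in cols] for a in cols]


# Hypothetical AUROC values: 2 GO terms (rows) x 3 inference methods (columns).
auroc = [[0.60, 0.70, 0.80],
         [0.50, 0.50, 0.90]]
scaled = zscore_rows(auroc)
distances = column_distances(scaled)
```

The resulting distance matrix is the kind of input a hierarchical clustering routine would take to group methods with similar performance profiles.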
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
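The logarithmic model in the Figure 4 caption regresses average AUROC on the natural logarithm of sample size. The sketch below shows an ordinary least-squares fit of that form in Python rather than with lm() in R. The function name `fit_log_model` and the data points are hypothetical (only the endpoints 12 and 1266 are sample sizes mentioned in the paper), and the p-value that lm() reports is not computed here.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = a + b * ln(sample_size).

    Returns the intercept a, the slope b, and R-squared.
    """
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(aurocs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    return a, b, 1.0 - ss_res / ss_tot


# Hypothetical data generated from a = 0.55, b = 0.02 (not values from the paper).
sizes = [12, 50, 200, 800, 1266]
aurocs = [0.55 + 0.02 * math.log(n) for n in sizes]
a, b, r_squared = fit_log_model(sizes, aurocs)
```

Because the invented data lie exactly on a logarithmic curve, the fit recovers the generating coefficients and R-squared is essentially 1; real AUROC values would scatter around the fitted line.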
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Fig 7 panel labels: A and B, Venn diagram sets MS, PA and SA; C, GO term graph labeled Hub (chromatin assembly, DNA packaging, DNA replication, gene silencing and related terms); D, GO term graph labeled Cell Wall (cellulose biosynthetic process, cell wall biogenesis, phenylpropanoid metabolic process and related terms)
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Network Performance Does Not Differ Based Upon Normalization Method
To compare the efficacy of three normalization and ten inference methods, a GCN was generated for each combination of normalization and inference methods. Furthermore, all networks were rank-standardized so that edge weights ranged from 0 to 1 (see Methods). All network evaluations used the whole adjacency matrix (15116 × 15116 in RNA-Seq networks; 11429 × 11429 or 17862 × 17862 in protein networks) without a cut-off.
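As an illustration of rank standardization, the following minimal sketch replaces every off-diagonal edge weight of a symmetric matrix by its rank, rescaled to lie between 0 and 1. The function names, the tie handling (average ranks), and the toy values are assumptions made for this example; the exact procedure used in the study is the one described in Methods.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_standardize(matrix):
    """Map each off-diagonal weight of a symmetric matrix to (rank - 1) / (edges - 1)."""
    n = len(matrix)
    pairs = list(combinations(range(n), 2))  # one entry per undirected edge
    ranks = average_ranks([matrix[i][j] for i, j in pairs])
    out = [[0.0] * n for _ in range(n)]      # diagonal is left at 0 (assumption)
    for (i, j), r in zip(pairs, ranks):
        out[i][j] = out[j][i] = (r - 1) / (len(pairs) - 1)
    return out


# Toy 4-gene network of raw correlation-like scores (hypothetical values).
raw = [
    [1.0, 0.9, -0.2, 0.1],
    [0.9, 1.0, 0.4, 0.0],
    [-0.2, 0.4, 1.0, 0.7],
    [0.1, 0.0, 0.7, 1.0],
]
standardized = rank_standardize(raw)
```

Because only the ordering of edge weights survives, networks produced by different inference methods become directly comparable on the same 0 to 1 scale.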
The performance of the different networks was measured by comparing the area under the receiver operating characteristic curve (AUROC). AUROC is a measurement used to evaluate the accuracy of classification models, making it suitable for evaluating GCNs (Gillis and Pavlidis, 2011; Ma and Wang, 2012; Liu et al., 2017). AUROC values range from 0 to 1, with a value closer to 1 indicating that the network is discriminating nonrandom patterns and approaching perfect classification, random networks returning values close to 0.5, and values closer to 0 indicating a high degree of incorrect classification. While an AUROC value close to 1 is optimal, values over 0.7 suggest good performance when analyzing large, diverse networks (Gillis and Pavlidis, 2011).
To set up the AUROC baseline for the random networks, maize gene IDs were shuffled 10 (for MRNET and CLR) or 1000 times (for PCC) in the normalized expression matrix. The randomized expression matrices were used for network inference with the designated algorithms and then evaluated. The resulting AUROC values from randomized networks were very close to 0.5 (Supplemental Table S2).
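The AUROC calculation and the random baseline can be sketched as follows. This is a simplified illustration rather than the pipeline used in the study: the scores and labels are hypothetical, and the baseline shuffles the positive labels directly instead of shuffling gene IDs in the expression matrix and re-running inference.

```python
import random


def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical edge weights from one query gene to six candidate genes,
# with 1 marking partners that appear in a reference set.
weights = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
known = [1, 1, 0, 1, 0, 0]
observed = auroc(weights, known)

# Random baseline: reassign the positive labels at random many times.
rng = random.Random(0)
baseline = []
for _ in range(1000):
    shuffled = known[:]
    rng.shuffle(shuffled)
    baseline.append(auroc(weights, shuffled))
mean_baseline = sum(baseline) / len(baseline)
```

In this toy case the observed AUROC is 8/9, while the shuffled labels average close to 0.5, which is the behavior expected of a network carrying no information about the reference set.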
AUROC values were calculated and compared for three different network characteristics. The first characteristic tested whether the network identified genes with known or predicted co-expression patterns, based upon prior results and inclusion in two existing datasets that could serve as a positive control for co-expression. The maize metabolic pathway database (MaizeCyc) contains 413 pathways with more than two genes and was built from evidence collected from genome annotation, phylogenetic distance, and known genes in maize, rice, and Arabidopsis (Monaco et al., 2013). The maize protein-protein interaction database (PPIM), the second dataset used in this analysis, is based upon both predicted and experimentally detected protein interactions (Zhu et al., 2016). Only high-confidence interactions from PPIM were used, defined as those ranking in the top 5% in their model (Zhu et al., 2016). For comparison with the GCN, genes within the same MaizeCyc or PPIM pathways were considered co-expressed. The MaizeCyc and PPIM datasets were combined, and genes with fewer than 5 interactions were excluded from evaluation, creating a compiled dataset referred to herein as the Protein-Protein and Pathway dataset (PPPTY). PPPTY had 1720 genes and 104856 interactions that were used in this evaluation. The AUROC value was calculated for each of the 1720 gene terms.
To assess the effect of normalization method on GCNs, AUROC values for all ten inference methods were averaged for each of the three normalization methods. All three normalization methods scored similarly in comparison with the PPPTY dataset (Fig. 2B), with a mean AUROC value around 0.575 for each, suggesting that the predicted networks were more selective than a random network.
The second characteristic was the presence of similar gene ontology (GO) information for maize genes within a detected co-expression set, based upon "guilt by association", which assumes that specific subgroups of co-expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from AgriGO (Du et al., 2010), which uses signature integration by InterPro to map gene IDs to GO terms rather than co-expression data. InterPro provided over 108 million stable GO terms to the functional protein information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations provide a reliable evaluation resource independent of co-expression data. To assess this characteristic, gene ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for sets of co-expression matrices, and the results were compared. Co-expression matrices were assessed by 3-fold cross-validation, which involved masking GO terms from some genes to test whether the masked GO terms could be predicted based upon gene expression patterns. In total, 277 GO terms were included in this analysis.
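The idea behind neighbor voting with cross-validation can be sketched as below. This is a hedged illustration and not the implementation of Gillis and Pavlidis (2011): the scoring rule (weight to still-annotated neighbors divided by total weight), the fold assignment, and the toy network are all assumptions chosen to keep the example small.

```python
def auroc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def neighbor_voting_cv(network, annotated, n_folds=3):
    """Mean AUROC over folds for one GO term.

    network   : symmetric list-of-lists of edge weights
    annotated : set of gene indices carrying the GO term
    """
    genes = sorted(annotated)
    aurocs = []
    for fold in range(n_folds):
        hidden = set(genes[fold::n_folds])   # annotations masked in this fold
        visible = annotated - hidden         # annotations still known
        candidates = [g for g in range(len(network)) if g not in visible]
        scores = []
        for g in candidates:
            total = sum(w for j, w in enumerate(network[g]) if j != g)
            votes = sum(network[g][j] for j in visible if j != g)
            scores.append(votes / total if total else 0.0)
        labels = [1 if g in hidden else 0 for g in candidates]
        aurocs.append(auroc(scores, labels))
    return sum(aurocs) / len(aurocs)


# Toy network: genes 0-2 are tightly co-expressed, as are genes 3-5.
network = [[1.0 if i == j else (0.9 if (i < 3) == (j < 3) else 0.1)
            for j in range(6)] for i in range(6)]
go_auroc = neighbor_voting_cv(network, annotated={0, 1, 2})
```

Here every masked gene is recovered ahead of the unannotated genes, so the mean AUROC is 1.0; real networks and GO terms give intermediate values such as those reported in this section.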
When GO characteristics were used to assess the networks, all three normalization methods performed similarly, but the AUROC values, at around 0.689 for each, were higher than those observed for comparisons with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions. The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets were 0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant difference in the AUROC scores associated with the GCNs for the characteristics that were tested.
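For readers unfamiliar with the test, the F statistic underlying a one-way ANOVA can be computed as in the sketch below. The AUROC values are hypothetical placeholders, not data from this study, and the p-value is deliberately omitted because it requires the F distribution (available, for example, through `scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical mean AUROC values for a few inference methods under each
# normalization method (placeholders only).
auroc_by_normalization = {
    "VST": [0.71, 0.72, 0.64, 0.65],
    "CPM": [0.72, 0.72, 0.65, 0.64],
    "RPKM": [0.71, 0.73, 0.64, 0.65],
}
f_stat, df_b, df_w = one_way_anova_f(list(auroc_by_normalization.values()))
```

A small F statistic, as in this toy input, corresponds to a large p-value: the spread between normalization methods is negligible compared with the spread within each method.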
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a subset of co-expressed genes. The interactions between such a protein and regulated DNA could be detected by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, there are five ChIP-Seq datasets available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et al., 2015; Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example, Opaque2 is specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral tissues (Bolduc et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed (Morohashi et al., 2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for comparison, with 26 confirmed binding targets that are relatively highly expressed in most maize tissues (Yang et al., 2016). Histone deacetylation often correlates with decreased gene expression (Verdin and Ott, 2014). High-confidence HDA101 targets were defined as those discovered by ChIP-Seq that also showed increased gene expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101 targets were compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar among networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY evaluations.

Correlation Methods Perform Better than Mutual Information at Some Genes
After normalization, the expression matrices can be processed by different methods for GCN inference. To optimize this step, the AUROC values of six correlation (PCC, SCC, KCC, GCC, BIC, CSC) and four mutual information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices that were generated from each of three normalization methods (VST, CPM, RPKM) and then averaged. In general, correlation methods are more computationally efficient, while MI methods are able to reveal non-linear relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC, KCC, and BIC are less sensitive to outliers because SCC and KCC only consider rank information and BIC is calculated from the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to be a better correlation method for gene expression analysis because of its capacity to detect non-linear relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR showed extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same AUROC-based tests were performed as described for the testing of normalization methods.
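The contrast between three of the correlation measures can be made concrete with a small sketch. It assumes that PCC, SCC, and CSC denote the Pearson, Spearman, and cosine similarity coefficients; the two expression profiles are invented so that one sample acts as an outlier.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation (SCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cosine(x, y):
    """Cosine similarity (CSC): no mean-centering, so zeros stay zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


# Hypothetical expression of two genes across five samples; the last sample
# of gene_b is an outlier that preserves the ordering.
gene_a = [1, 2, 3, 4, 5]
gene_b = [1, 2, 3, 4, 50]
pcc, scc, csc = pearson(gene_a, gene_b), spearman(gene_a, gene_b), cosine(gene_a, gene_b)
```

Because the outlier keeps the rank order intact, SCC stays at 1 while PCC drops to roughly 0.74, which illustrates why rank-based measures are described above as less sensitive to outliers.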
When assessed with the GO dataset, the 277 AUROC values were averaged to create one average value for each of the 10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization methods was 0.718 for the six correlation methods and 0.646 for the four MI methods. The majority of the 277 GO terms had similar AUROC values in the GCNs generated by the different correlation methods, and these patterns differ from those observed in the MI-generated GCNs (Fig. 3A). The similarity among different methods was also detectable by pairwise comparison and by comparing Pearson correlations between the different methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed that the networks constructed using correlation methods resulted in higher AUROC values than MI methods, although the CSC method resulted in lower AUROC values than other correlation methods. As demonstrated for the GO evaluation, results from correlation methods were more similar to each other than the MI methods (Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B), although this subset was small enough that the average AUROC value over the whole gene set was relatively low for those methods. The gene sets with the highest AUROC values in PCC, CSC, or MRNET were extracted. The gene sets were compared by average expression (CPM) and average number of low-expression elements (CPM < 0). The CSC gene set had the smallest number of low-expression elements and had higher average expression than both the 1720-gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from the 26 targets in the HDA101 ChIP-Seq dataset reveal that the CSC GCN had the highest AUROC value and that MRNET/CLR GCNs resulted in slightly higher scores than correlation methods (Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes (Tzfadia et al., 2016). HDA101 is a highly expressed gene in all samples, with an average expression value of 864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is more suitable for the CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5% non-zero values for all comparisons, so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize GCN from a single expression matrix, but co-expression with some individual genes may be better detected using MI methods. Normalization method did not have a substantial influence on GCN performance, so only CPM normalization was used in conjunction with PCC, SCC, MRNET, and CLR inference for subsequent optimization of other parameters.

Increased Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were conducted with different numbers of samples and experiments to empirically determine the effect of sample number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between 12 and 404 libraries. For this analysis, the CPM normalization method followed by each of four inference methods (PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were evaluated by both GO and PPPTY.
In the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between natural-log-transformed sample size and average AUROC values (Fig. 4). The linear relationships were stronger for the PCC and SCC methods, with higher r-squared values, indicating that correlation methods benefit more from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC values than the MRNET and CLR methods in the PPPTY and GO analyses for most of the individual networks (Fig. 5).
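The relationship described here, average AUROC regressed on the natural logarithm of sample size, can be sketched with an ordinary least-squares fit. The library counts and AUROC values below are hypothetical and only span the 12 to 404 range mentioned in the text; they are not results from the study.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares y = slope * x + intercept, plus r-squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical (library count, mean AUROC) pairs for one inference method.
sample_sizes = [12, 24, 60, 150, 404]
mean_auroc = [0.55, 0.58, 0.61, 0.64, 0.67]

log_sizes = [math.log(n) for n in sample_sizes]  # natural-log transform
slope, intercept, r2 = linear_fit(log_sizes, mean_auroc)
```

A positive slope with an r-squared near 1 means that each multiplicative increase in library count adds a roughly constant increment to AUROC, which is the pattern reported for PCC and SCC.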

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also be used to change the outcomes of GCN by buffering the effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated rank-standardized correlation/MI matrices were calculated from separate experiments to determine if this approach enhanced GCN performance. Aggregating individual networks together for meta-analysis can help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET, and CLR methods for GCN inference as before, resulting in 68 single GCNs. The 17 experiments were aggregated for PCC, SCC, MRNET, and CLR individually and evaluated by the GO and PPPTY datasets.
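One simple way to aggregate rank-standardized networks is sketched below: each per-experiment network is converted to ranks scaled between 0 and 1, the ranks are averaged edge by edge, and the averages are ranked once more. Averaging is an assumption made for this illustration; the aggregation actually used is the one defined in Methods, and the gene names and weights are invented.

```python
def rank_standardize(edges):
    """edges: dict {(gene_a, gene_b): weight} -> dict of ranks scaled to [0, 1]."""
    keys = sorted(edges, key=edges.get)
    out = {}
    i = 0
    while i < len(keys):
        j = i
        while j + 1 < len(keys) and edges[keys[j + 1]] == edges[keys[i]]:
            j += 1
        for k in range(i, j + 1):
            # (i + j) / 2 is the 0-based average rank of a run of tied weights.
            out[keys[k]] = ((i + j) / 2) / (len(keys) - 1)
        i = j + 1
    return out


def aggregate(networks):
    """Average rank-standardized weights per edge, then standardize again."""
    standardized = [rank_standardize(net) for net in networks]
    mean_rank = {
        edge: sum(net[edge] for net in standardized) / len(standardized)
        for edge in standardized[0]
    }
    return rank_standardize(mean_rank)


e12, e13, e23 = ("g1", "g2"), ("g1", "g3"), ("g2", "g3")
# Hypothetical edge weights inferred from three separate experiments;
# the third experiment disagrees with the first two.
experiments = [
    {e12: 0.9, e13: 0.2, e23: 0.5},
    {e12: 0.7, e13: 0.6, e23: 0.1},
    {e12: 0.3, e13: 0.1, e23: 0.8},
]
aggregated = aggregate(experiments)
```

The edge supported by two of the three experiments (g1 and g2) ends up with the top aggregated rank, showing how aggregation can dampen the influence of a single heterogeneous experiment.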
Of the 4 aggregated networks that were evaluated, the two correlation methods (PCC and SCC) had higher AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR networks compared with single networks from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCN performed best using a ranked aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a robust GCN.

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level and function of a protein of interest. However, many researchers have found inconsistencies between mRNA and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al., 2016). Although relatively less protein expression data is available, this data is amenable to GCN construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO datasets as previously mentioned.
GCNs constructed from protein expression did not exhibit superior AUROC values to those observed for RNA-Seq-based
GCNs using the aggregation strategy (Fig. 6). When evaluated with the GO and PPPTY datasets, the
performance of the protein network was lower than that of the aggregated network as well as the single network from
1266 samples. To confirm this result, a two-way ANOVA with pairwise comparison was computed for the GO
evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A
subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were
significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for
some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 overlapped with our RNA-Seq-based
network and were therefore used for the analysis. To demonstrate that the performance of the protein
network was not biased by the selection of genes, the PCC method was applied to all 17862 genes
to construct a protein network (Supplemental Fig. 7). No improvement could be detected for the protein network
derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY evaluation
from a one-sided Wilcoxon rank sum test.
351
PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be
compared based upon several network characteristics, including clustering coefficient, number of
nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007),
number of detected modules and number of genes in the largest module. The number of nodes is a basic quantity in
graph theory that describes the scale of a network. The clustering coefficient and the number of modules describe how
densely nodes are connected in a network. Heterogeneity measures the variability of node connections.
Centralization indicates how likely it is that some nodes have substantially more connections than average. In this
analysis, each gene corresponds to a node. Based on the extensive evaluation using biological
characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize
networks were selected for comparison of basic network characteristics based on their overall performance:
the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single
network from 1266 total samples (MS). The three networks were constrained to include the top one million
predicted interactions, or edges.
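As an illustration of the two degree-based measures named above, the following Python sketch computes heterogeneity and centralization from an edge list. It assumes the commonly cited definitions (heterogeneity as the standard deviation of node degree divided by the mean degree, and centralization as n/(n-2) times the difference between the maximum scaled degree and the network density); the use of the population variance, the function name `degree_stats` and the toy graph are assumptions for illustration, and this is not the Cytoscape implementation used in the analysis.

```python
import math

def degree_stats(edges):
    """Return (node count, heterogeneity, centralization) for an undirected edge list.

    Nodes without any edge are not represented in an edge list and are ignored.
    """
    degree = {}
    seen = set()
    for a, b in edges:
        key = frozenset((a, b))
        if a == b or key in seen:
            continue  # skip self-loops and duplicate edges
        seen.add(key)
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    n = len(degree)
    ks = list(degree.values())
    mean_k = sum(ks) / n
    var_k = sum((k - mean_k) ** 2 for k in ks) / n  # population variance (assumption)
    density = mean_k / (n - 1)
    heterogeneity = math.sqrt(var_k) / mean_k
    centralization = n / (n - 2) * (max(ks) / (n - 1) - density)
    return n, heterogeneity, centralization

# Toy star graph: one hub gene connected to four other genes.
star = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("hub", "g4")]
n_nodes, het, cen = degree_stats(star)

# Toy ring graph: every gene has exactly two neighbors.
ring = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g1")]
_, ring_het, ring_cen = degree_stats(ring)
```

Under these definitions the star graph reaches the maximum centralization, while the ring graph, in which all degrees are equal, has zero heterogeneity and zero centralization.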
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution
(Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks
constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and
the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS
network had the highest network centralization value. The network heterogeneity value of MS was over two
times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table
S4), consistent with the highest centralization value observed for this network. Centralization and
heterogeneity are two measures that summarize the degree distribution of a network. A scale-free network with a larger
number of hubs has larger values of centralization and heterogeneity. Conversely, a network with larger values of
centralization and heterogeneity may either contain a larger number of hubs or contain a modest number of hubs
with an extremely imbalanced degree distribution. In biological networks, many observations have
connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath
and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the
possibility that high values result from an extremely imbalanced degree distribution. For the MS network,
most highly connected genes interacted with a large number of lowly connected genes; this pattern is also
reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental
Fig. 8). The genes with the most interactions are expected to act as key components in GCNs
(Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein
biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were
analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and
processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig.
7A). In total, 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong
candidates for central biological regulators. The annotation of these genes suggests their participation in a
range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation and gene
silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was
expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic
regulators were found among the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a) and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was
used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern
between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger modules than the PA and SA networks. Most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization
value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with
related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA
networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so these
methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA and MS were compared,
and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA
and SA shared 83.5% of their interactions, while MS had 87.3% unique interactions
(Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall
biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php
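The merging step described above can be sketched as a set intersection over the top-ranked edges of each network. The Python example below is a minimal illustration under stated assumptions: each network is represented as a dictionary from a gene pair to a weight, each undirected pair appears once per network, and the names `top_edges`, `merge_networks` and the toy weights are hypothetical. It is not the code used to build the published merged network.

```python
def top_edges(weights, k):
    """Return the k highest-weight edges as a set of unordered gene pairs."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:k]}

def merge_networks(networks, k):
    """Intersect the top-k edge sets of several networks.

    Returns the shared edges and the set of genes they involve.
    """
    edge_sets = [top_edges(w, k) for w in networks]
    shared = set.intersection(*edge_sets)
    genes = set().union(*shared) if shared else set()
    return shared, genes

# Toy weights for three networks over the same three genes (illustrative values).
pa = {("g1", "g2"): 0.90, ("g1", "g3"): 0.80, ("g2", "g3"): 0.10}
sa = {("g2", "g1"): 0.95, ("g1", "g3"): 0.70, ("g2", "g3"): 0.20}
ms = {("g1", "g2"): 0.60, ("g2", "g3"): 0.50, ("g1", "g3"): 0.40}

shared_edges, shared_genes = merge_networks([pa, sa, ms], k=2)
```

Storing each edge as a `frozenset` makes the comparison independent of gene order, so that ("g1", "g2") and ("g2", "g1") count as the same interaction.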
411
Case Study Cell Wall Biosynthesis and Regulation 412
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged
network was compared to the existing knowledge of this pathway. Sixteen well-characterized components of
cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose
synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes and one glycosidase
gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted
from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in
the network that met the analysis criteria. As expected, cell wall related GO terms were
enriched among these 214 genes (Fig. 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to
retrieve homologs and their annotations. The literature was manually searched using the maize
genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature
survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1 and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14 and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6) and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). In CORNET, 10 of the 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 of the 16 genes showing predicted protein associations (Fig. 8C), resulting in 817
interactions with 76 genes; 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize co-expressed genes in our network is provided as
supplemental material (Supplemental Dataset S2).
452
Discussion 453
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq
data will provide a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed in the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
among samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is applied per library, which means that genes within the same library are scaled by similar factors.
Thus, when the network is constructed by PCC or BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. The two rank correlations, SCC
and KCC, only consider differences in relative rankings, on which normalization has a limited effect; the same
applies to the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in
parmigene (Sales and Romualdi, 2011). Since the three normalization methods produced similar expression
distributions (Supplemental Fig. 2), MI estimates from the different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods such as PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study of human GCNs (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods in the GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed more robust performance than any single inference method in in silico and E. coli
expression networks, referring to this as "the wisdom of crowds". However, for analysis of the available maize data,
integration of PCC, SCC, MRNET and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq-based
maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.
488
Materials and Methods 489
RNA-Seq Data Collection and Process 490
The maize genome and its annotation were downloaded from Ensembl Plants Release 31
(http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on the Illumina HiSeq 2000 or HiSeq 2500,
were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files
were converted to fastq format using the fastq-dump command in the SRA Toolkit (version 2.5.2). Adapters
were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files were then quality
checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim
et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated with
featureCounts 1.5.0 (Liao et al., 2014) from the aligned bam files (Supplemental Fig. S1). Twenty-six libraries with fewer
than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving
1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was
streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
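The library exclusion criteria above amount to a simple two-threshold filter. The following Python sketch illustrates it; the function name `passes_qc`, the library names and the read statistics are hypothetical, and only the two thresholds and the sample counts are taken from the text.

```python
def passes_qc(total_reads, alignment_rate, min_reads=5_000_000, min_rate=0.70):
    """Keep a library only if it meets both the read-depth and alignment thresholds."""
    return total_reads >= min_reads and alignment_rate >= min_rate

# Hypothetical libraries: (total reads, overall alignment rate).
libraries = {
    "lib_A": (12_400_000, 0.91),  # kept
    "lib_B": (3_100_000, 0.88),   # excluded: fewer than 5 million reads
    "lib_C": (20_000_000, 0.62),  # excluded: alignment rate below 70%
}
kept = [name for name, (reads, rate) in libraries.items() if passes_qc(reads, rate)]

# Sample bookkeeping from the text: 1303 downloaded, 26 and 11 excluded.
remaining = 1303 - 26 - 11
```

The bookkeeping line reproduces the 1266 samples reported for the final expression table.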
502
Gene Count Normalization 503
The expression data was normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al.,
2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM value).
For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. The
Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with
expression higher than 2 CPM in more than 1000 samples were included in additional analysis (15116 genes).
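For readers unfamiliar with these units, the Python sketch below illustrates CPM, RPKM, the log2(x + 1) transformation and the expression filter for a single library. It is a simplified illustration and not the edgeR procedure used in the analysis: the raw library size is used without the TMM scale factor, and the function names, counts and gene lengths are assumptions chosen for the example.

```python
import math

def cpm(counts):
    """Counts per million for one library (raw library size, no TMM scaling)."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene length per million reads for one library."""
    total = sum(counts)
    return [c / (length / 1e3) / (total / 1e6) for c, length in zip(counts, lengths_bp)]

def log2p1(values):
    """The log2(x + 1) transformation applied to CPM or RPKM values."""
    return [math.log2(v + 1) for v in values]

def keep_gene(cpm_across_samples, min_cpm=2.0, min_samples=1000):
    """True if the gene exceeds min_cpm in more than min_samples samples."""
    return sum(v > min_cpm for v in cpm_across_samples) > min_samples

# Toy library with three genes (illustrative counts and gene lengths in bp).
counts = [10, 90, 900]
lengths = [1000, 2000, 500]
cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths)
log_cpm = log2p1(cpm_values)
```

In the toy library the third gene has the highest CPM and, being the shortest, an even higher RPKM, which shows how the length correction changes the relative ranking of genes within a library.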
510
Network Inference 511
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. The Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function.
The Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al.,
2009). The Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and
Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and
Horvath, 2008). The cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt,
2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011).
The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned rank
one. The ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all
network weights ranged from zero to one.
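The rank normalization of adjacency weights described above can be sketched as follows. This Python example is one possible reading of that description: tied values receive their average rank (an assumption, since the tie-handling rule is not stated), and the function name `rank_normalize` and the toy matrix are illustrative.

```python
def rank_normalize(matrix):
    """Rank all entries of a square matrix (smallest value gets rank 1),
    divide ranks by the number of entries and set the diagonal to one."""
    n = len(matrix)
    flat = sorted((matrix[i][j], i, j) for i in range(n) for j in range(n))
    ranks = [[0.0] * n for _ in range(n)]
    pos = 0
    while pos < len(flat):
        end = pos
        while end + 1 < len(flat) and flat[end + 1][0] == flat[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # average 1-based rank of the tied block
        for _, i, j in flat[pos:end + 1]:
            ranks[i][j] = avg_rank
        pos = end + 1
    total = n * n
    return [[1.0 if i == j else ranks[i][j] / total for j in range(n)]
            for i in range(n)]

# Toy symmetric correlation matrix for three genes (illustrative values).
pcc = [
    [1.0, 0.8, -0.2],
    [0.8, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
]
normalized = rank_normalize(pcc)
```

Because the transformation depends only on the ordering of the weights, networks inferred by methods with very different weight scales, such as correlation coefficients and mutual information, become directly comparable.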
523
Network Performance Evaluation 524
To generate the random networks, gene IDs were shuffled randomly in the CPM- or VST-normalized expression
matrices. Networks were then inferred from the randomized expression matrices with the PCC, MRNET or CLR methods and
evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and
CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
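A minimal Python sketch of the gene ID shuffling is given below, assuming the expression matrix is held as a dictionary from gene ID to a row of expression values. The function name `shuffle_gene_ids`, the fixed seed and the toy values are assumptions for illustration; the original randomization was performed in R.

```python
import random

def shuffle_gene_ids(expression, seed=None):
    """Reassign gene IDs to expression rows at random.

    The rows themselves are unchanged, so the co-expression structure is
    preserved while its link to gene annotation is broken.
    """
    rng = random.Random(seed)
    gene_ids = list(expression)
    shuffled = gene_ids[:]
    rng.shuffle(shuffled)
    return {new_id: expression[old_id] for new_id, old_id in zip(shuffled, gene_ids)}

# Toy expression matrix: gene ID -> expression values across samples.
expression = {"g1": [1.0, 2.0], "g2": [3.0, 4.0], "g3": [5.0, 6.0]}
randomized = shuffle_gene_ids(expression, seed=1)
```

Evaluating networks built from such relabeled matrices gives the baseline AUROC expected when gene annotation and expression are unrelated.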
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of
their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et
al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data
for AGPv3.30 was downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems
was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of co-expressed
genes. The prediction() and performance() functions in the R package ROCR were used to calculate
AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package
(Ballouz et al., 2016) in R. EGAD applies the "guilt-by-association" principle that genes with shared GO
terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be
evaluated by hiding the GO terms of a subset of genes and testing whether the hidden GO terms can be predicted
from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold
cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and
pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and
Hochberg, 1995).
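To make the AUROC measure concrete, the Python sketch below computes it from edge weights and binary labels using the rank-sum (Mann-Whitney U) identity. This is a generic illustration and not the ROCR or EGAD implementation; the function name `auroc`, the average-rank treatment of ties and the toy scores are assumptions.

```python
def auroc(scores, labels):
    """AUROC from scores and 0/1 labels via the rank-sum identity.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, counting ties as one half.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # average 1-based rank of the tied block
        for idx in order[pos:end + 1]:
            ranks[idx] = avg_rank
        pos = end + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: network weights for five gene pairs; label 1 marks pairs
# present in a reference set such as known interactions.
weights = [0.9, 0.8, 0.4, 0.3, 0.1]
known = [1, 0, 1, 0, 0]
example_auroc = auroc(weights, known)
```

In the toy example five of the six positive-negative pairs are ordered correctly, so the AUROC is 5/6. A value of 0.5 corresponds to random ordering and 1.0 to perfect separation.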
Definitions of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). For the
evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are
co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not; TN, a
network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network
predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation
using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our
GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our
GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our GO
dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO
dataset.
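The four categories defined above for the PPPTY evaluation can be counted over all unordered gene pairs, as in the following Python sketch. It assumes that predictions have already been reduced to a yes or no call per gene pair (for example by thresholding edge weights), which is a simplification because AUROC is computed across all thresholds; the function name and toy gene pairs are hypothetical.

```python
from itertools import combinations

def confusion_counts(genes, predicted, reference):
    """Count TP, FP, TN and FN over all unordered gene pairs."""
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    tp = fp = tn = fn = 0
    for pair in combinations(genes, 2):
        key = frozenset(pair)
        if key in pred and key in ref:
            tp += 1  # predicted and present in the reference set
        elif key in pred:
            fp += 1  # predicted but absent from the reference set
        elif key in ref:
            fn += 1  # not predicted but present in the reference set
        else:
            tn += 1  # neither predicted nor in the reference set
    return tp, fp, tn, fn

# Toy example with four genes (six possible pairs).
genes = ["g1", "g2", "g3", "g4"]
predicted_pairs = [("g1", "g2"), ("g1", "g3"), ("g3", "g4")]
reference_pairs = [("g2", "g1"), ("g3", "g4"), ("g2", "g4")]
tp, fp, tn, fn = confusion_counts(genes, predicted_pairs, reference_pairs)
```

The true positive rate TP/(TP + FN) and the false positive rate FP/(FP + TN) derived from these counts are the two axes of the ROC curve.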
557
Network Clustering and Characterization 558
For each network, the top 1 million edges were selected as a stringent co-expression network. The network
topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity
distribution and node degree distribution were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012).
Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14-137 and the inflation value set
to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
564
Gene Ontology Enrichment and Visualization 565
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010).
The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used
to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was
used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The
results were then imported into Cytoscape for visualization.
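The hypergeometric test used for enrichment can be illustrated with a short Python sketch. It computes the upper-tail probability of observing at least k annotated genes in a query list; the function name, argument names and toy numbers are assumptions, the multiple test correction is not included, and this is not the AgriGO implementation.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: genes in the background, K: background genes with the GO term,
    n: genes in the query list, k: query genes with the GO term.
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

# Toy example: background of 20 genes, 5 of which carry the term;
# a query list of 4 genes contains 2 annotated genes.
p_value = hypergeom_pvalue(k=2, n=4, K=5, N=20)
```

For the toy numbers the p-value is 1205/4845, or about 0.25, so observing two annotated genes in such a small list would not be considered enrichment. In the analysis described above, N would be the 15116 background genes.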
571
Databases Comparison on Cell Wall Pathway 572
Sixteen well-characterized (Penning et al., 2009; Bosch et al., 2011) components of cell wall biosynthesis
(Supplemental Table S8) were chosen as query genes to search against CORNET Maize
(https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using the
Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET
database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05 and Top genes = 50. This
resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff
was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved.
Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+
(Camacho et al., 2009).
582
Acknowledgments 583
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice
and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department
of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful
discussion on data analysis.
588
Supplemental Data 589
Supplemental Figure 1 Pipeline and datasets used for analysis 590
Supplemental Figure 2 Distribution of gene expression values 591
Supplemental Figure 3 Maize CPM-normalized with log2 transformed gene expression from all tissues and 592
developmental stages 593
Supplemental Figure 4 Pairwise comparison among results of inferences methods 594
Supplemental Figure 5 Characteristic of all 1720 PPPTY gene set (ALL_1720) genes with highest AUROC 595
values in CSC method (CSC) PCC method (PCC) and MRNET method (MRNET) 596
Supplemental Figure 6 Evaluation of network performance based on sample size and inference 597
Supplemental Figure 7 GCN performance comparison between protein networks 598
Supplemental Figure 8 Average neighborhood connectivity for three selected networks PCC-aggregated (PA) 599
SCC-aggregated (SA) and MRNET-single (MS) 600
Supplemental Figure 9 Node distribution for four selected networks PCC-aggregated (PA) SCC-aggregated 601
(SA) and MRNET-single (MS) and the intersection among three networks (Merged network) 602
Supplemental Figure 10 Network representation of PCC ranked aggregation network (PA) 603
Supplemental Table S1 RNA-Seq libraries used in this analysis 604
Supplemental Table S2 Random network AUROC value baseline 605
Supplemental Table S3 ANOVA tables and pairwise comparisons 606
Supplemental Table S4 Topological characteristics of four maize networks 607
Supplemental Table S5 Gene Ontology annotation for 148 hub genes 608
Supplemental Table S6 Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8 609
Supplemental Table S7 Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8 610
Supplemental Table S8 16 query genes in maize cell wall pathway 611
Supplemental Table S9 GO enrichment analysis for 214 co-expressed genes of cell wall query genes in
merged network
Supplemental Table S10 Annotation for co-expressed genes queried by 16 cell wall pathway genes from 614
merged network 615
Supplemental Table S11 Annotation for co-expressed genes queried by 16 cell wall pathway genes from 616
CORNET database 617
Supplemental Table S12 Annotation for co-expressed genes queried by 16 cell wall pathway genes from 618
STRING database 619
Supplemental Dataset S1 The merged network in Cytoscape-ready format 620
Supplemental Dataset S2 Tutorial Visualizing Co-expression data in Cytoscape 621
622
624
625
626
Figure legends 627
628
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year generated by microarray platforms
GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq
Illumina samples (solid line) per year, 2008 to 2016.
638
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from comparisons with the GO dataset for samples normalized using the Variance Stabilizing Transformation
(VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for samples normalized using the
VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman
Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for networks
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for networks constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile.
Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest
and lowest AUROC values.
657
Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes,
respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard
normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by
VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA,
MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC,
Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient;
BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
667
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 differently sized networks plotted against natural logarithm
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 differently
sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted
logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC,
Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy
NETwork; CLR, Context Likelihood of Relatedness.
675
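The logarithmic model described in the Figure 4 legend, average AUROC regressed on log(sample size), can be sketched with ordinary least squares. The legend states that R's lm() was used; the Python function `fit_log_model` below is a stand-in written for illustration, it reports only the coefficients and R-square (no p-value), and the data points are synthetic.

```python
from math import log


def fit_log_model(sample_sizes, auroc):
    """Fit auroc = intercept + slope * ln(sample size) by least squares."""
    x = [log(n) for n in sample_sizes]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(auroc) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, auroc))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, auroc))
    ss_tot = sum((yi - mean_y) ** 2 for yi in auroc)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic data generated from a known logarithmic relationship.
sizes = [12, 24, 48, 404, 1266]
values = [0.5 + 0.02 * log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
```
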
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using PCC
(black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17
experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to
1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are
designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"),
and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation on networks.
Inference methods were indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation on networks. Inference methods were indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars), and
protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m),
or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate
outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey
bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s),
MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots
indicate outliers.

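For readers unfamiliar with the AUROC statistic reported throughout these legends, the following is a generic sketch of its definition: the probability that a randomly chosen positive item is scored above a randomly chosen negative one, with ties counted as one half. The legends do not describe the authors' evaluation code, so the function `auroc` and the toy scores and labels are assumptions for illustration only.

```python
def auroc(scores, labels):
    """Area under the ROC curve from pairwise positive/negative comparisons."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    # A positive outranking a negative counts 1; a tie counts 0.5.
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four genes; 1 marks a gene annotated
# to the GO term under evaluation, 0 marks an unannotated gene.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
value = auroc(scores, labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of annotated from unannotated genes.
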
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram
shows the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among
the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET
single network. B, Venn diagram shows the overlap among the top 1 × 10^6 edges in three networks; 106591
edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared
highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214
co-expressed genes queried by 16 cell wall pathway genes.

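The edge overlap in Figure 7B amounts to taking the highest-scoring edges of each network and intersecting the three sets. A minimal sketch follows, assuming each network is available as a mapping from gene pairs to edge scores; the helper `top_edges`, the gene names, and the scores are hypothetical, and a cutoff of 2 edges stands in for the one million used in the figure.

```python
def top_edges(weights, k):
    """Return the k highest-scoring edges as a set of undirected gene pairs."""
    ranked = sorted(weights, key=weights.get, reverse=True)[:k]
    # frozenset makes (a, b) and (b, a) the same undirected edge.
    return {frozenset(edge) for edge in ranked}


# Hypothetical edge scores for three small networks.
pa = {("g1", "g2"): 0.90, ("g2", "g3"): 0.80, ("g1", "g3"): 0.10}
sa = {("g2", "g1"): 0.95, ("g3", "g4"): 0.70, ("g2", "g3"): 0.20}
ms = {("g1", "g2"): 0.60, ("g2", "g3"): 0.50, ("g3", "g4"): 0.40}

k = 2
shared = top_edges(pa, k) & top_edges(sa, k) & top_edges(ms, k)
```
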
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA), and
MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C,
Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan
nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes
without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted
interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported to the R environment for the
normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in the acquired datasets. Tissues are listed by name with the
percentage of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are
grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues
represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different
maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries
originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes
represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the
dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by
Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million
mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line,
CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) log2 transformation. VST values are
log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the
dnorm() function in R, which takes the mean value and standard deviation from the log2-transformed
expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and
plotted as a function of gene length in base pairs (bp).

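Two of the three normalizations named in the Supplemental Figure 2 legend have simple closed forms, sketched below for a single library. This uses the standard definitions of CPM and RPKM, not the authors' pipeline. The counts, the gene lengths, and the pseudocount of 1 added before the log2 transformation are assumptions for the example, since the legend does not state how zeros were handled. VST is not reproduced here.

```python
from math import log2


def cpm(counts):
    """Counts per million: scale each gene count by library size."""
    library_size = sum(counts)
    return [c * 1e6 / library_size for c in counts]


def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene length per million mapped reads."""
    library_size = sum(counts)
    return [c * 1e9 / (library_size * length)
            for c, length in zip(counts, lengths_bp)]


def log2_transform(values, pseudocount=1.0):
    """log2 transform with an assumed pseudocount to avoid log2(0)."""
    return [log2(v + pseudocount) for v in values]


# Hypothetical read counts and gene lengths for three genes in one library.
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 500]

cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
log_cpm = log2_transform(cpm_values)
```

Unlike CPM, RPKM divides by gene length, which is why the relationship between expression and gene length in panel C differs between the two methods.
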
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental
stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated
by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation
comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was
plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO
datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values as
calculated by Pearson correlation shown in the triangle above the diagonal. B, PPPTY evaluation comparisons
for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the
diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets
were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values as calculated
by Pearson correlation shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC,
Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation
Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA,
Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Supplemental Figure 5. Characteristics of the full PPPTY gene set of 1720 genes (ALL_1720) and of genes with
the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed
elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC
values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted
against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to
1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are
designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range
above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed
lines are the average AUROC values from the 17 individual networks of each category. Mean values of each
network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with
11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal
lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network.
Red and blue curves show the power-law fitted distribution. The R2 value indicates the goodness of fit to the
power-law model.

Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree
(number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated
beside it.

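The power-law fits mentioned in Supplemental Figures 8 and 9 can be illustrated with a common approach: tabulate the node degree distribution, then fit a straight line to it on log-log axes. The legends do not state how the fit was obtained, so the log-log least-squares method, the function names `degree_distribution` and `fit_power_law`, and the synthetic data are all assumptions for this sketch.

```python
from collections import Counter
from math import exp, log


def degree_distribution(edges):
    """Map each node degree to the number of nodes with that degree."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


def fit_power_law(distribution):
    """Fit count = a * degree**b by least squares on log-log axes.

    Returns (a, b, r_squared), with r_squared computed on the log scale.
    """
    x = [log(k) for k in distribution]
    y = [log(n) for n in distribution.values()]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return exp(log_a), b, 1 - ss_res / ss_tot


# A star-shaped toy network: one hub connected to three leaves.
star = degree_distribution([("hub", "g1"), ("hub", "g2"), ("hub", "g3")])

# Synthetic distribution following count = 1000 * degree**-2 exactly.
synthetic = {1: 1000.0, 2: 250.0, 4: 62.5}
a, b, r_squared = fit_power_law(synthetic)
```
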
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in colors. Genes not in modules 1-8 are light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake: a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples per year (solid line), 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
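As an illustration of the AUROC metric used throughout these evaluations, the following minimal Python sketch computes AUROC from a list of scores and binary labels using the rank-sum (Mann-Whitney U) identity. It is not the evaluation code used in the study; the function name `auroc`, the 0/1 label encoding, and the toy inputs are assumptions made for this example.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the rank-sum identity; tied scores receive average ranks.

    labels: 1 for a known positive (e.g. a gene annotated to a GO term),
    0 otherwise. Both the name and the encoding are illustrative assumptions.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the positives, normalised by the number of pos/neg pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical toy data: higher score should mean "more likely positive".
    print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to a ranking that places every positive above every negative, which is why the captions compare methods by how far their AUROC distributions sit above 0.5.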
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between [illegible] (blue) and [illegible] (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
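The scaling and distance steps named in this caption can be sketched as follows. This is a hedged illustration only: the helper names `scale_to_standard` and `euclidean`, the use of the sample standard deviation, and the toy AUROC values are assumptions, and the hierarchical clustering that would consume the distances is not shown.

```python
import math
import statistics
from typing import List, Sequence


def scale_to_standard(values: Sequence[float]) -> List[float]:
    """Centre values on their mean and divide by the sample standard deviation.

    Assumption: one call handles the AUROC values of a single GO term (or gene)
    across methods; the source does not state which SD estimator was used.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("cannot scale values with zero variance")
    return [(v - mean) / sd for v in values]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Hypothetical AUROC values for one GO term under three methods.
    print(scale_to_standard([0.6, 0.7, 0.8]))
    # Distance between two hypothetical scaled profiles.
    print(euclidean([0.0, 0.0], [3.0, 4.0]))
```

Scaling first puts GO terms with very different baseline AUROC values on a common footing, so that the distance between methods reflects their relative ranking of terms rather than absolute AUROC levels.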
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. Panel titles: GO PCC, GO SCC, GO MRNET, GO CLR (A); PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR (B). PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
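The logarithmic model described in this caption has the form AUROC = a + b · log(sample size). The caption states that the fit was done with lm() in R; the Python sketch below is a stand-in that performs the same ordinary least-squares fit and reports R-squared (the p-value that lm() also returns is omitted). The function name `fit_log_model` and the example data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_log_model(sample_sizes: Sequence[int],
                  aurocs: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (slope, intercept, r_squared).
    """
    if len(sample_sizes) != len(aurocs) or len(aurocs) < 2:
        raise ValueError("need two or more paired observations")
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(aurocs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2
                 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    return slope, intercept, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical sample sizes and average AUROC values, for illustration only.
    sizes = [12, 100, 500, 1266]
    values = [0.5 + 0.02 * math.log(s) for s in sizes]
    print(fit_log_model(sizes, values))
```

A positive slope on the log scale means that each multiplicative increase in sample size adds a roughly constant increment of AUROC, so gains diminish as more libraries are added.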
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
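The three-way overlap summarised by these Venn diagrams amounts to set arithmetic on the gene (or edge) lists of the three networks. The sketch below shows one way to count the seven regions; the function name `venn3_counts` and the toy gene identifiers are hypothetical and do not come from the study's data.

```python
from typing import Dict, Iterable


def venn3_counts(pa: Iterable[str], sa: Iterable[str],
                 ms: Iterable[str]) -> Dict[str, int]:
    """Count the seven regions of a three-set Venn diagram.

    pa, sa, ms: identifiers (genes or edges) taken from the PCC aggregation,
    SCC aggregation, and MRNET single networks, respectively.
    """
    pa, sa, ms = set(pa), set(sa), set(ms)
    return {
        "PA only": len(pa - sa - ms),
        "SA only": len(sa - pa - ms),
        "MS only": len(ms - pa - sa),
        "PA&SA only": len((pa & sa) - ms),
        "PA&MS only": len((pa & ms) - sa),
        "SA&MS only": len((sa & ms) - pa),
        "PA&SA&MS": len(pa & sa & ms),
    }


if __name__ == "__main__":
    # Hypothetical gene identifiers, for illustration only.
    counts = venn3_counts(
        ["g1", "g2", "g3", "g4"],
        ["g2", "g3", "g5"],
        ["g3", "g4", "g6"],
    )
    for region, count in counts.items():
        print(region, count)
```

For edges, each identifier would need a canonical form (for example, the two gene names sorted and joined) so that the same undirected edge compares equal across networks.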
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed Citations
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5-e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science (80- ) 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545-3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707-726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572-15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143-175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670-85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575-1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924-13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200-20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244-56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860-1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1-8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866-12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29-46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 20134291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
expressed genes have some shared functions (Wolfe et al., 2005). GO annotations were downloaded from
AgriGO (Du et al., 2010), which uses signature integration by InterPro, rather than co-expression data, to map
gene IDs to GO terms. InterPro provided over 108 million stable GO terms to the functional protein
information database UniProtKB at release 2016_01 (Sangrador-Vegas et al., 2016). Thus, the GO annotations
provide a reliable evaluation resource that is independent of co-expression data. To assess this characteristic,
gene ontology information was used in a neighbor voting algorithm (Gillis and Pavlidis, 2011) for each set of
co-expression matrices, and the results were compared. Co-expression matrices were assessed by 3-fold
cross-validation, which involved masking GO terms from some genes to test whether the masked GO terms
could be predicted based upon gene expression patterns. In total, 277 GO terms were included in this analysis.
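The neighbor voting evaluation described above can be illustrated with a minimal sketch. It assumes a symmetric, non-negative co-expression matrix and a binary label vector for a single GO term; the function names `auroc` and `neighbor_voting_auroc` and the toy network are hypothetical and are not taken from the authors' pipeline, which may differ in its scoring and fold construction.

```python
import random


def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U statistic); tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank of the tied block
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def neighbor_voting_auroc(coexp, labels, n_folds=3, seed=0):
    """Mean AUROC over folds for one GO term (assumes at least n_folds annotated genes)."""
    n = len(labels)
    positives = [g for g in range(n) if labels[g]]
    random.Random(seed).shuffle(positives)
    folds = [positives[f::n_folds] for f in range(n_folds)]
    aurocs = []
    for fold in folds:
        hidden = set(fold)                  # annotations masked in this fold
        known = set(positives) - hidden     # annotations still visible
        candidates = [g for g in range(n) if g not in known]
        scores = []
        for g in candidates:
            total = sum(coexp[g][h] for h in range(n) if h != g)
            votes = sum(coexp[g][h] for h in known)
            # Score = share of a gene's co-expression weight that goes to annotated genes.
            scores.append(votes / total if total else 0.0)
        truth = [1 if g in hidden else 0 for g in candidates]
        aurocs.append(auroc(scores, truth))
    return sum(aurocs) / len(aurocs)


# Toy network (made up): genes 0-5 form one tight module, genes 6-11 another.
n_genes = 12
coexp = [[0.0 if a == b else (1.0 if (a < 6) == (b < 6) else 0.1)
          for b in range(n_genes)] for a in range(n_genes)]
labels = [1] * 6 + [0] * 6  # the hypothetical GO term annotates the first module
mean_auroc = neighbor_voting_auroc(coexp, labels)
```

In this toy case the masked genes sit in the same module as the remaining annotated genes, so they outrank every unannotated gene and the mean AUROC reaches its maximum; real networks give intermediate values such as those reported in the text.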
When GO characteristics were used to assess the networks, all three normalization methods performed
similarly, but the AUROC values, at around 0.689 for each, were higher than those observed for comparisons
with PPPTY (Fig. 2A). Because GO addresses gene functions and PPPTY emphasizes protein-protein
interactions, this suggests that GCNs are better at predicting functional interactions than physical interactions.
The p-values from one-way ANOVA testing the effect of normalization method on the PPPTY and GO datasets
were 0.9535 and 0.4714, respectively, confirming that the normalization method did not create a significant
difference in the AUROC scores associated with the GCNs for the characteristics that were tested.
Finally, proteins that regulate gene expression or modify chromatin structure might interact with the DNA of a
subset of co-expressed genes. The interactions between such a protein and regulated DNA can be detected
by chromatin immunoprecipitation of associated DNA followed by DNA sequencing (ChIP-Seq). In maize, there
are five ChIP-Seq datasets available (Bolduc et al., 2012; Morohashi et al., 2012; Li et al., 2015a; Pautler et
al., 2015; Yang et al., 2016), some of which involve lowly expressed or tissue-specific genes. For example,
Opaque2 is specifically expressed in endosperm (Li et al., 2015a), Knotted1 is expressed in SAM and floral
tissues (Bolduc et al., 2012), and Pericarp Color1 has low expression except in inflorescence and seed
(Morohashi et al., 2012). Histone Deacetylase 101 (HDA101) ChIP-Seq data provided the largest dataset for
comparison, with 26 confirmed binding targets that are relatively highly expressed in most maize tissues
(Yang et al., 2016). Histone deacetylation often correlates with decreased gene expression (Verdin and Ott,
2014). High-confidence HDA101 targets were defined as those discovered by ChIP-Seq that also showed
increased gene expression in the hda101 mutant. Networks associated with the 26 high-confidence HDA101
targets were compared by calculating AUROC. Based upon this analysis, the AUROC values were very similar
among networks normalized by VST, CPM, and RPKM (Fig. 2C), which is consistent with the GO and PPPTY
evaluations.

Correlation Methods Perform Better than Mutual Information at Some Genes
After normalization, the expression matrices can be processed by different methods for GCN inference.
To optimize this step, the AUROC values of six correlation methods (PCC, SCC, KCC, GCC, BIC, CSC) and four
mutual information (MI) methods (AA, MA, MRNET, CLR) were compared for the expression matrices
generated by each of the three normalization methods (VST, CPM, RPKM) and then averaged. In general,
correlation methods are more computationally efficient, while MI methods are able to reveal non-linear
relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC,
KCC, and BIC are less sensitive to outliers because SCC and KCC only consider rank information and BIC
is calculated from the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown
to be a better correlation method for gene expression analysis because of its capacity to detect non-linear
relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and
for analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET, and CLR showed
extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al.,
2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same
testing parameters and AUROC calculations were used as described for the testing of normalization
methods.
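To make the contrast between these measures concrete, the following sketch computes PCC, SCC, and CSC for two expression profiles in plain Python. The helper names and the example vectors are hypothetical; in practice these measures would be computed over the whole expression matrix with an optimized library rather than gene pair by gene pair.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation (SCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def cosine(x, y):
    """Cosine similarity (CSC): no mean-centering, so shared zeros add nothing."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


# Made-up profiles: gene_b tracks gene_a except for one extreme sample.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 2.0, 3.0, 4.0, 100.0]
pcc = pearson(gene_a, gene_b)    # pulled well below 1 by the outlier
scc = spearman(gene_a, gene_b)   # rank order is identical, so SCC stays at 1

# Made-up sparse profiles with many zeros.
sparse_a = [1.0, 0.0, 0.0, 2.0]
sparse_b = [2.0, 0.0, 0.0, 4.0]
csc = cosine(sparse_a, sparse_b)
```

Because SCC depends only on rank order, the single extreme sample leaves it unchanged while PCC drops, which is the outlier sensitivity noted above.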
Assessed with the GO dataset, the 277 AUROC values were averaged to create one average value for each of
the 10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization
methods for the six correlation methods was 0.718, while the average AUROC for the four MI methods was
0.646. The majority of the 277 GO terms had similar AUROC values in the GCNs generated by the different
correlation methods, and these patterns differed from those observed in the MI-generated GCNs (Fig. 3A).
The similarity among different methods was also detectable by pairwise comparison of Pearson
correlations between the different methods (Supplemental Fig. 4A).
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were
averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed
that the networks constructed using correlation methods had higher AUROC values than those constructed
using MI methods, although the CSC method resulted in lower AUROC values than the other correlation
methods. As demonstrated for the GO evaluation, results from correlation methods were more similar to each
other than to those from the MI methods (Supplemental Fig. 4B). Interestingly, heatmap results indicated that a
subset of genes consistently had higher AUROC values when CSC, MRNET/CLR, or AA/MA were used (Fig. 3B),
although this subset includes few enough genes that the average AUROC value over the whole gene set was
relatively low for those methods. The gene sets with the highest AUROC values in PCC, CSC, or MRNET were
extracted. The characteristics of each gene set were compared in terms of average expression (CPM) and
average number of lowly expressed elements (CPM < 0). The CSC gene set had the smallest number of lowly
expressed elements and had higher average expression than both the 1720-gene set and the PCC gene set
(Supplemental Fig. 5). This may indicate that the CSC method is better at determining co-expression for
highly expressed genes.
The AUROC values from the 26 targets in the HDA101 ChIP-Seq dataset reveal that the CSC GCN had the
highest AUROC value and that the MRNET/CLR GCNs resulted in slightly higher scores than the correlation
methods (Fig. 2F). This could be explained by the small number of targets creating skewed results, but may
also indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes
(Tzfadia et al., 2016). HDA101 is a highly expressed gene in all samples, with an average expression value of
864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is more suitable for the
CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using the
two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than
0.5% non-zero values for all comparisons, so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive
maize GCN from a single expression matrix, but co-expression with some individual genes may be better
detected using MI methods. Normalization method did not have a substantial influence on GCN performance,
so only CPM normalization was used in conjunction with PCC, SCC, MRNET, and CLR inference for subsequent
optimization of other parameters.

Increased Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can
influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were
conducted with different numbers of samples and experiments to empirically determine the effect of sample
number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between
12 and 404 libraries. For this analysis, CPM normalization followed by each of four inference methods
(PCC, SCC, MRNET, and CLR) was applied to the 17 experiments, and the 68 resulting networks were
evaluated by both GO and PPPTY.
In both the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between the
natural logarithm of sample size and average AUROC values (Fig. 4). The linear relationships were stronger for
the PCC and SCC methods, which had higher r-squared values, indicating that correlation methods benefit more
from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be
included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC
values than the MRNET and CLR methods in the PPPTY and GO analyses for most of the individual networks (Fig. 5).
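The relationship described above, average AUROC regressed on the natural logarithm of sample size, can be sketched as an ordinary least-squares fit. The function name `fit_log_sample_size` is hypothetical, and the sample sizes and AUROC values below are made-up illustrations, not results from this study.

```python
import math


def fit_log_sample_size(sample_sizes, aurocs):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    mx = sum(x) / len(x)
    my = sum(aurocs) / len(aurocs)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in aurocs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, aurocs))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared


# Illustrative numbers only: one hypothetical AUROC per experiment.
sizes = [12, 24, 48, 96, 200, 404]
mean_aurocs = [0.56, 0.58, 0.61, 0.62, 0.65, 0.66]
slope, intercept, r_squared = fit_log_sample_size(sizes, mean_aurocs)
```

A positive slope with a high r-squared value corresponds to the pattern reported for PCC and SCC; a weaker fit would correspond to the MI methods.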

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also be used to change the outcome of a GCN by buffering the
effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated,
rank-standardized correlation/MI matrices were calculated from separate experiments to determine whether
this approach enhanced GCN performance. Aggregating individual networks together for meta-analysis can
help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a;
Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC,
SCC, MRNET, and CLR methods for GCN inference as before, resulting in 68 single GCNs. The 17 experiments
were aggregated for PCC, SCC, MRNET, and CLR individually and evaluated with the GO and PPPTY datasets.
Of the four aggregated networks that were evaluated, the two built with correlation methods (PCC and SCC)
had higher AUROC values than the corresponding single networks from 1266 samples (Fig. 6 and Supplemental
Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET
and CLR networks compared with the single networks from 1266 samples (two-tailed Wilcoxon rank test for
GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on
sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not
demonstrate improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCNs performed best
using a ranked aggregation strategy, and use of this strategy in combination with the other optimized
parameters creates a robust GCN.
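One simple way to picture ranked aggregation is to convert each experiment's edge scores to scaled ranks and then average those ranks edge by edge. The sketch below follows that idea; the function names, the dictionary-based network representation, and the use of a plain mean are assumptions made for illustration, and the exact standardization used for the networks in this study may differ.

```python
def rank_standardize(network):
    """Replace each edge score by rank / number of edges, giving values in (0, 1].

    network: dict mapping an edge (a pair of gene IDs) to a correlation or MI score.
    Tied scores share their average rank.
    """
    edges = sorted(network, key=network.get)
    scaled = {}
    i = 0
    while i < len(edges):
        j = i
        while j + 1 < len(edges) and network[edges[j + 1]] == network[edges[i]]:
            j += 1
        for k in range(i, j + 1):
            scaled[edges[k]] = ((i + j) / 2 + 1) / len(edges)
        i = j + 1
    return scaled


def aggregate_networks(networks):
    """Average the rank-standardized score of each edge over the networks containing it."""
    totals, counts = {}, {}
    for network in networks:
        for edge, value in rank_standardize(network).items():
            totals[edge] = totals.get(edge, 0.0) + value
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: totals[edge] / counts[edge] for edge in totals}


# Two made-up experiments whose raw scores are on very different scales.
experiment_1 = {("g1", "g2"): 0.90, ("g1", "g3"): 0.10, ("g2", "g3"): 0.50}
experiment_2 = {("g1", "g2"): 0.20, ("g1", "g3"): 0.10, ("g2", "g3"): 0.15}
aggregated = aggregate_networks([experiment_1, experiment_2])
```

Because only ranks are carried forward, an experiment with uniformly weak scores contributes as much to the ordering of edges as one with strong scores, which is one way heterogeneity between experiments can be buffered.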

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level
and function of a protein of interest. However, many researchers have found inconsistencies between mRNA
and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al.,
2016). Although relatively little protein expression data is available, such data is amenable to GCN
construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein
expression atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein
networks were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and
GO datasets described above.
GCNs constructed from protein expression did not exhibit AUROC values superior to those observed for
RNA-Seq-based GCNs built using the aggregation strategy (Fig. 6). When evaluated with the GO and PPPTY
datasets, the performance of the protein network was lower than that of the aggregated network as well as
that of the single network from 1266 samples. To confirm this result, a two-way ANOVA was computed with
pairwise comparison for the GO evaluation, which showed that the effect of network type was significant
(Supplemental Table S3). A subsequent pairwise comparison using the Wilcoxon rank sum test indicated that
the PCC/SCC methods were significantly better than MRNET/CLR (Supplemental Table S3), although MI
methods may be superior for some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 overlapped with our RNA-Seq-based
network and were therefore used for the analysis. To demonstrate that the performance of the protein
network was not biased by the selection of genes, the PCC method was applied to all 17862 genes
to construct a protein network (Supplemental Fig. 7). No improvement could be detected for the protein
network derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY
evaluation from a one-sided Wilcoxon rank sum test.

PCC- and SCC-built GCNs Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be
compared based upon several network characteristics, including clustering coefficient, number of
nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007),
number of detected modules, and number of genes in the largest module. Number of nodes is a basic construct
in graph theory depicting the scale of a network. Clustering coefficient and number of modules model how
densely nodes are connected in networks. Heterogeneity measures the variability of node connections.
Centralization indicates how likely it is that some nodes have significantly more connections than average. In
this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological
characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize
networks were selected for comparison of basic network characteristics based on their overall performance:
the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built
single network from 1266 total samples (MS). The three networks were constrained to include the top one
million predicted interactions, or edges.
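For an unweighted network, heterogeneity and centralization can both be computed from the list of node degrees. The sketch below uses the definitions commonly attributed to Dong and Horvath (2007): heterogeneity as the standard deviation of degree divided by mean degree, and centralization as n/(n-2) times the difference between the maximum scaled degree and the network density. These formulas, the use of the population variance, and the function names are stated here as assumptions; the values in this study may have been obtained with a network analysis tool that differs in detail.

```python
import math


def degrees_from_edges(edges):
    """Count the number of edges touching each node of an undirected network."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


def heterogeneity(degrees):
    """Coefficient of variation of node degree (population variance assumed)."""
    mean = sum(degrees) / len(degrees)
    variance = sum((k - mean) ** 2 for k in degrees) / len(degrees)
    return math.sqrt(variance) / mean


def centralization(degrees):
    """n/(n-2) * (max degree/(n-1) - density); requires at least three nodes."""
    n = len(degrees)
    density = sum(degrees) / (n * (n - 1))
    return n / (n - 2) * (max(degrees) / (n - 1) - density)


# Two made-up five-node networks: a star (one hub) and a ring (no hub).
star = [("hub", leaf) for leaf in ("a", "b", "c", "d")]
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
star_degrees = list(degrees_from_edges(star).values())
ring_degrees = list(degrees_from_edges(ring).values())

star_centralization = centralization(star_degrees)
star_heterogeneity = heterogeneity(star_degrees)
ring_centralization = centralization(ring_degrees)
ring_heterogeneity = heterogeneity(ring_degrees)
```

The star network, dominated by a single hub, sits at the high end of both measures, whereas the ring, in which every node has the same degree, scores zero on both.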
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution
(Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks
constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8)
and the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7.
The MS network had the highest network centralization value. The network heterogeneity value of MS was
over two times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental
Table S4), consistent with the highest centralization value observed for this network. Centralization and
heterogeneity are two ways to model the degree distribution of networks. A scale-free network with more
hubs has larger values of centralization and heterogeneity; conversely, a network with larger values of
centralization and heterogeneity may contain a larger number of hubs, or the number of hubs may not be
significantly large but the degree distribution is extremely imbalanced. In biological networks, many
observations have connected large values of centralization and heterogeneity with more hub genes (Ma and
Zeng, 2003; Horvath and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we
cannot rule out the possibility that the high values resulted from an extremely imbalanced degree distribution.
For the MS network, most highly connected genes interacted with a large number of lowly connected genes;
this pattern is also reflected in the decreasing neighborhood connectivity distribution for the MS network
(Supplemental Fig. 8). The genes with the most interactions are expected to act as key components in GCNs
(Langfelder and Horvath 2008 Allen et al 2012) and likely represent central regulators of multi-protein 383
biological processes (Ma et al 2013 Du et al 2015) The top 1000 interacting genes from all networks were 384
analyzed in more detail as these were potential ldquohubrdquo genes that may regulate other expression patterns and 385
processes PA and SA shared 95 of the top 1000 interacting genes while MS had 835 unique genes (Fig 386
7A) 148 genes were shared among all three networks (Supplemental Table S5) making these genes strong 387
candidate for central biological regulators The annotation of these genes suggests their participation in a 388
range of basic cellular process (Fig 7C) including gene expression DNA replication translation and gene 389
silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was
expected because of their cellular abundance and involvement with translation. Interestingly, nine epigenetic
regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
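The two topology statistics discussed above can be illustrated with a minimal sketch. It assumes definitions commonly used for undirected, unweighted networks: heterogeneity as the coefficient of variation of node degree, and centralization as n/(n-2) x (maximum degree/(n-1) - density). The function names and toy gene names are hypothetical, and the values reported by Cytoscape's NetworkAnalyzer may differ in detail (for example, in how the variance is normalized).

```python
from math import sqrt


def node_degrees(edges):
    """Return {node: number of distinct neighbours} for an undirected edge list."""
    neighbours = {}
    for a, b in edges:
        if a == b:  # ignore self-loops
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}


def heterogeneity(degrees):
    """Coefficient of variation of degree (population standard deviation / mean)."""
    k = list(degrees.values())
    mean = sum(k) / len(k)
    variance = sum((x - mean) ** 2 for x in k) / len(k)
    return sqrt(variance) / mean


def centralization(degrees):
    """Degree centralization (assumes at least 3 nodes): 0 for a regular graph, 1 for a star."""
    k = list(degrees.values())
    n = len(k)
    density = sum(k) / (n * (n - 1))
    return n / (n - 2) * (max(k) / (n - 1) - density)


# Toy networks: a single hub versus an evenly connected ring.
star = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("hub", "g4")]
ring = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5"), ("g5", "g1")]
```

In this toy case the star network scores high on both statistics and the ring scores zero on both, which matches the interpretation that hub-dominated networks such as MS have larger centralization and heterogeneity.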
To reveal the underlying properties of GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was
used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern
between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger modules than the PA and SA networks. Consequently, most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization
value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with
related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA
networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so these
methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged,
and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA
and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions
(Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall
biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php.
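The merging step can be sketched as a set intersection over undirected edges. This is only an illustration under stated assumptions: each network is represented as a dictionary mapping a gene pair to its weight, and the helper names (top_edges, merge_networks) and gene identifiers are hypothetical rather than taken from the analysis scripts.

```python
def top_edges(weights, n_top):
    """Return the n_top highest-weighted gene pairs as a set of unordered pairs."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    # frozenset makes (a, b) and (b, a) the same undirected edge.
    return {frozenset(pair) for pair, _ in ranked[:n_top]}


def merge_networks(*edge_sets):
    """Keep only edges present in every network and list the genes they connect."""
    shared = set.intersection(*edge_sets)
    genes = set().union(*shared)
    return shared, genes


# Toy weights for three networks (standing in for PA, SA, and MS).
pa = {("g1", "g2"): 0.90, ("g2", "g3"): 0.80, ("g1", "g3"): 0.10}
sa = {("g2", "g1"): 0.95, ("g3", "g4"): 0.70, ("g2", "g3"): 0.60}
ms = {("g1", "g2"): 0.50, ("g2", "g3"): 0.40, ("g4", "g5"): 0.90}

shared_edges, shared_genes = merge_networks(
    top_edges(pa, 2), top_edges(sa, 2), top_edges(ms, 2)
)
```

With one million edges per network in place of two, the same intersection would yield the merged edge list and its gene set.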

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged
network was compared to existing knowledge of this pathway. Sixteen well-characterized components of
cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose
synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase
gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted
from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in
the network that met the analysis criteria. As expected, cell wall-related GO terms were
enriched for these 214 genes (Fig. 7D; Supplemental Table S9).
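A guide-gene query of this kind can be sketched as follows. The sketch assumes that only edges touching at least one guide gene are kept (first neighbours); whether edges between two non-guide neighbours were also retained in the analysis is not stated, so treat this as one plausible reading. The function name and gene labels are hypothetical.

```python
def guide_subnetwork(edges, guide_genes):
    """Extract edges that touch a guide gene, the genes involved, and unmatched guides."""
    guides = set(guide_genes)
    kept = [(a, b) for a, b in edges if a in guides or b in guides]
    genes = {gene for edge in kept for gene in edge}
    # Guide genes with no co-expressed partner in the network.
    unmatched = guides - genes
    return kept, genes, unmatched


# Toy merged network and three guide genes, one of which has no partners.
toy_edges = [("cesa_a", "x1"), ("x1", "x2"), ("cesa_b", "x1"), ("x3", "x4")]
kept, genes, unmatched = guide_subnetwork(toy_edges, ["cesa_a", "cesa_b", "cesa_c"])
```

The unmatched set corresponds to guide genes, such as the two mentioned above, that return no co-expressed genes.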
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to
retrieve homologs and their annotations. The literature was manually searched using the maize
genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature
survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall-related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817
interactions with 76 genes; 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network was
provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq
data will provide a useful set of optimized parameters to support these analyses.
In our analysis, VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is on a library basis, which means genes within the same library are normalized by similar factors.
So when the network is constructed by PCC or BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. The two rank correlations, SCC
and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same
holds for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in
parmigene (Sales and Romualdi, 2011). Since the three normalization methods shared similar expression
distributions (Supplemental Fig. 2), MI estimations from different normalizations are expected to be similar.
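One part of this argument can be checked with a small sketch: for a single gene, RPKM differs from CPM only by a constant gene-length factor, and log2(x + 1) is monotone, so the ranks of that gene across libraries, and therefore any Spearman correlation, are unchanged. The expression values and gene lengths below are made up for illustration, and the helper functions are plain reimplementations, not the R functions used in the analysis.

```python
from math import log2, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for idx in order[start:end + 1]:
            ranks[idx] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical CPM values for two genes across five libraries.
gene_a_cpm = [5.0, 40.0, 12.0, 300.0, 80.0]
gene_b_cpm = [9.0, 30.0, 45.0, 500.0, 60.0]
# RPKM-style values: each gene divided by its own (assumed) length in kb.
gene_a_rpkm = [v / 1.5 for v in gene_a_cpm]
gene_b_rpkm = [v / 4.0 for v in gene_b_cpm]
# log2(x + 1) transform, as applied to CPM and RPKM in the Methods.
gene_a_log = [log2(v + 1) for v in gene_a_rpkm]
gene_b_log = [log2(v + 1) for v in gene_b_rpkm]
```

Here spearman() returns the same value for the CPM, RPKM, and log-transformed vectors, whereas pearson() on the log-transformed values is not guaranteed to match across normalizations.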
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study in human GCN analysis (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed a more robust performance than any single inference method in in silico and E. coli
expression networks, referring to "the wisdom of crowds". However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an
RNA-Seq-based maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plants Release 31
(http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or HiSeq 2500,
were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files
were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). The adapters in
the fastq files were trimmed by Cutadapt 1.8.1 (Martin, 2011). The adapter-removed files were then quality
checked by FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim
et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by
featureCounts 1.5.0 (Liao et al., 2014) from aligned bam files (Supplemental Fig. S1). 26 libraries with fewer
than 5 million total reads and 11 libraries with a total alignment rate below 70% were excluded, leaving
1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was
streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated by the edgeR package (Robinson et al.,
2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM value).
For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR.
Variance Stabilizing Transformation (VST) was calculated by the DESeq2 package (Love et al., 2014). Only genes with
expression higher than 2 CPM in more than 1000 samples were included in additional analysis (15116 genes).
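The arithmetic behind CPM, RPKM, the log2 transformation, and the gene filter can be sketched in a few lines. This is a simplified illustration, not the edgeR implementation: the norm_factor argument is a stand-in for a TMM scaling factor that would have to be estimated separately, and all function names are hypothetical.

```python
from math import log2


def cpm(counts, library_size, norm_factor=1.0):
    """Counts per million for one library; norm_factor stands in for a TMM factor."""
    effective_size = library_size * norm_factor
    return [c * 1e6 / effective_size for c in counts]


def rpkm(counts, gene_lengths_bp, library_size, norm_factor=1.0):
    """Reads per kilobase per million: CPM divided by gene length in kilobases."""
    effective_size = library_size * norm_factor
    return [c * 1e9 / (effective_size * length)
            for c, length in zip(counts, gene_lengths_bp)]


def log2_plus_one(values):
    """The log2(x + 1) transformation applied to CPM or RPKM values."""
    return [log2(v + 1) for v in values]


def passes_filter(cpm_across_samples, min_cpm=2.0, min_samples=1000):
    """Gene filter from the text: CPM above min_cpm in more than min_samples samples."""
    return sum(1 for v in cpm_across_samples if v > min_cpm) > min_samples
```

For example, a gene with 10 reads in a library of 1000 mapped reads has a CPM of 10000, and dividing by a 2 kb gene length gives an RPKM of 5000.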

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated by the cor() function.
Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al.,
2009). Gini Correlation Coefficient was calculated by the adjacencymatrix() function in the rsgcc package (Ma and
Wang, 2012). Biweight midcorrelation was computed by the bicor() function in the WGCNA package (Langfelder and
Horvath, 2008). Cosine similarity coefficient was computed by the cosine() function in the coop package (Schmidt,
2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011).
The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned
rank one. Ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that
all network weights ranged from zero to one.
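The rank standardization described above can be sketched as follows. The text does not say how tied weights were ranked; this sketch assumes tied values share their average rank, and the function name is hypothetical.

```python
def rank_normalize(matrix):
    """Replace each weight by rank / (number of cells); set the diagonal to one."""
    n = len(matrix)
    cells = [(matrix[i][j], i, j) for i in range(n) for j in range(n)]
    cells.sort(key=lambda cell: cell[0])  # smallest weight receives rank 1
    total = n * n
    out = [[0.0] * n for _ in range(n)]
    start = 0
    while start < total:
        end = start
        while end + 1 < total and cells[end + 1][0] == cells[start][0]:
            end += 1
        average_rank = (start + end) / 2 + 1  # assumption: ties share the mean rank
        for _, i, j in cells[start:end + 1]:
            out[i][j] = average_rank / total
        start = end + 1
    for i in range(n):
        out[i][i] = 1.0
    return out


# Toy symmetric adjacency matrix for three genes.
toy = [[1.0, 0.9, 0.1],
       [0.9, 1.0, 0.5],
       [0.1, 0.5, 1.0]]
ranked = rank_normalize(toy)
```

Because only the ordering of weights is kept, matrices produced by different inference methods end up on the same zero-to-one scale and can be compared or aggregated directly.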

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM- or VST-normalized expression
matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR methods and
evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and
CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of
their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et
al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data
for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq-confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems
was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of
co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate
AUROCs (Sing et al., 2005). The 277 AUROC values for GO datasets were calculated by the EGAD package
(Ballouz et al., 2016) in R. Basically, it utilizes the "guilt-by-association" principle that genes with shared GO
terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be
evaluated by hiding a subset of genes' GO terms and testing whether the hidden GO terms can be predicted
from the remaining annotations. The prediction model performance was measured by AUROC values in
three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and
pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and
Hochberg, 1995).
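For readers unfamiliar with AUROC, the following sketch computes it from edge weights and known positive pairs using the rank-sum (Mann-Whitney) formulation. It is a plain reimplementation for illustration, not the ROCR or EGAD code, and the function name and toy inputs are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum identity; labels are truthy for known positive pairs."""
    order = sorted(range(len(scores)), key=lambda idx: scores[idx])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # tied scores share the mean rank
        for idx in order[start:end + 1]:
            ranks[idx] = average_rank
        start = end + 1
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative example")
    rank_sum = sum(rank for rank, label in zip(ranks, labels) if label)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: four gene pairs, two of which are known interactions.
edge_weights = [0.9, 0.3, 0.8, 0.1]
known_pairs = [1, 1, 0, 0]
```

An AUROC of 0.5 corresponds to random ranking of known pairs, which is the baseline that the shuffled networks described above are meant to estimate empirically.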
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the
evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are
co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not; TN, a
network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network
predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation
using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our
GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our
GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our
GO dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO
dataset.

Network Clustering and Characterization
For each network, the top 1 million edges were selected as stringent co-expression networks. The network
topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity
distribution and node degree distribution were plotted by the NetworkAnalyzer plugin (Doncheva et al., 2012).
Graph clustering was performed using the Markov Cluster Algorithm (MCL) by MCL v14-137 with the inflation value set
to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010).
The 15116 genes involved in our networks were used as the background reference. Hypergeometric testing was used
to calculate p-values, for which a value below 0.05 was considered significant. The Yekutieli method was
used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The
results were then imported into Cytoscape for visualization.
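The hypergeometric enrichment test can be written out directly. This is a minimal sketch of the upper-tail probability only; it is not AgriGO's implementation, it omits the Yekutieli correction, and the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits, query_size, annotated, background):
    """P(X >= hits) when query_size genes are drawn from the background without
    replacement and `annotated` background genes carry the GO term."""
    upper = min(query_size, annotated)
    total = comb(background, query_size)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, query_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total
```

For a query such as the 214 co-expressed genes against the 15116-gene background, the call would take the form hypergeom_enrichment_p(hits, 214, annotated, 15116), with hits and annotated taken from the GO annotation of the term being tested.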

Database Comparison on Cell Wall Pathway
Sixteen well-characterized (Penning et al., 2009; Bosch et al., 2011) components of cell wall biosynthesis
(Supplemental Table S8) were chosen as query genes to search against CORNET Maize
(https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and the STRING database using
the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET
database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This
resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff
was set to 0.4, with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved.
Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+
(Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice
and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department
of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful
discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 PPPTY gene set (ALL_1720) and of genes with the highest AUROC
values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the
merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the
merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the
CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the
STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year generated by microarray platforms
GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq
Illumina samples (solid line) per year, 2008-2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from GO dataset comparisons for samples normalized using Variance Stabilizing Transformation
(VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using
VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman
Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for networks constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile.
Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest
and lowest AUROC values.

Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes,
respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard
normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by
VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA,
MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC,
Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient;
BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different-sized networks plotted against natural logarithm
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized
networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic
regression models were plotted as black lines. R-squared and p-values were calculated by the lm() function in R. PCC,
Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy
NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using PCC
(black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled as S12_1 to S404; S1266 included all samples from the 17 experiments. B,
Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266)
libraries were plotted against sample size. Networks with the same number of samples are
designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"),
and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation on networks.
Inference methods were indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation on networks. Inference methods were indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars), and
protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m),
or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers. B,
AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars), and
protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m),
or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared among
the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET
single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks. 106591 edges
were shared among the three networks. C, GO term enrichment analyzed by AgriGO for 148 shared highly
interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed
genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and
MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. C,
Network retrieved from the STRING database queried by 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported to the R environment for the
normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in acquired datasets. Tissues are listed by name with the percentage
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 22
of the1266 libraries originating from each tissue SAM= Shoot Apical Meristem Samples are grouped by tissue 727
and may be represented by one or more developmental stages of that tissue Tissues represented by less than 728
10 libraries were grouped together as Others C Relative representation of different maize genotypes in our 729
datasets Genotypes are listed by name with the percentage of the 1266 libraries originating from each tissue 730
MAGIC = Multi-parent Advanced Generation InterCrosses Genotypes represented by more than 10 libraries 731
were grouped together as Others 732
733
Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) log2 transformation. VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation of the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
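The normalizations and the fitted normal curve named in this legend can be illustrated with a minimal Python sketch. It is not the authors' R code: the toy counts, the gene lengths, the function names, and the pseudocount of 1 added before the log2 transformation are all assumptions made for illustration. VST is omitted because it requires a fitted dispersion model.

```python
import math
from statistics import NormalDist, mean, stdev


def cpm(counts):
    """Counts Per Million: scale each gene count by the library's total count."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: CPM divided by gene length in kilobases."""
    return [v / (length / 1000.0) for v, length in zip(cpm(counts), lengths_bp)]


def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption, not from the paper."""
    return [math.log2(v + pseudocount) for v in values]


def fitted_normal_density(values, grid):
    """Normal density using the sample mean and sd of `values`, evaluated on `grid`
    (the quantity R's dnorm(grid, mean(values), sd(values)) would return)."""
    dist = NormalDist(mean(values), stdev(values))
    return [dist.pdf(x) for x in grid]


# Toy library: five genes with made-up counts and lengths (hypothetical data).
counts = [10, 200, 3000, 40, 6750]
lengths_bp = [500, 1000, 2000, 1500, 3000]
log_cpm = log2_transform(cpm(counts))
density = fitted_normal_density(log_cpm, grid=[0, 5, 10, 15, 20])
```

Comparing `density` with the empirical distribution of `log_cpm` is the kind of visual check the dotted lines in panels A and B provide.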
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.
Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks. AUROC values evaluated by GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks. AUROC values evaluated by PPPTY datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM for the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.
Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC value from the 17 individual networks of each category. Mean values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
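The outlier rule stated in this legend can be written out as a short Python sketch. The quartile interpolation used here (Python's "inclusive" method) is an assumption; R's boxplot computes hinges slightly differently, so borderline points may be classified differently. The AUROC values are made up.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR above the 75% quantile
    or below the 25% quantile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical AUROC values for one network; 0.95 sits far above the rest.
auroc_values = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
flagged = iqr_outliers(auroc_values)
```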
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R² value indicates the goodness of fit to the power-law model.
Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the fitted function and R² indicated beside it.
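A power-law fit of a degree distribution, with its R², can be sketched as follows. This assumes the common approach of ordinary least squares on log-transformed values; the legend does not say how the fit was computed, so the method, the function name, and the toy data are illustrative only.

```python
import math


def fit_power_law(degrees, node_counts):
    """Fit node_counts ~ a * degrees**b by least squares on log-log values.
    Returns (a, b, r_squared), with r_squared computed on the log scale."""
    xs = [math.log(d) for d in degrees]
    ys = [math.log(c) for c in node_counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Hypothetical degree distribution that follows 1000 * degree**-2 exactly.
degrees = [1, 2, 4, 8, 16]
node_counts = [1000.0 * d ** -2 for d in degrees]
a, b, r_squared = fit_power_law(degrees, node_counts)
```

An R² close to 1 on the log scale is what a scale-free-like degree distribution would produce; real networks usually deviate at the low- and high-degree ends.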
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in color. Genes not in modules 1-8 are shown as light grey nodes.
Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620, identified by a text search (dashed line), compared with the number of RNA-Seq Illumina samples per year (solid line), 2008-2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
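For readers less familiar with the metric, AUROC can be computed directly from its rank interpretation: the probability that a randomly chosen positive gene receives a higher score than a randomly chosen negative gene, with ties counted as one half. The Python sketch below shows only this generic definition; it is not the evaluation code used for the figure, and the scores and labels are made up.

```python
def auroc(scores, labels):
    """Area under the ROC curve from pairwise comparisons of positives and negatives.
    Ties between a positive and a negative score count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four genes; 1 marks membership in a gene set.
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
chance = auroc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])
```

A value of 1.0 means every positive outranks every negative, and 0.5 corresponds to random ordering.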
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values ranging from negative (blue) to positive (red). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
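The scaling and clustering step named in the Figure 3 caption can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the AUROC matrix is invented, each row (one GO term) is scaled across methods using the sample standard deviation, and only the pairwise Euclidean distances that a hierarchical clustering would start from are computed.

```python
import math


def zscore(values):
    """Scale a list of values to mean 0 and (sample) standard deviation 1."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    sd = math.sqrt(var)
    if sd == 0:
        raise ValueError("cannot scale a constant row")
    return [(v - mean) / sd for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical AUROC values: rows are GO terms, columns are inference methods.
methods = ["PCC", "SCC", "CLR"]
auroc_rows = [
    [0.60, 0.62, 0.70],
    [0.70, 0.71, 0.65],
    [0.80, 0.78, 0.60],
]

# Scale each GO term across methods, then collect one profile per method.
scaled_rows = [zscore(row) for row in auroc_rows]
profiles = {m: [row[j] for row in scaled_rows] for j, m in enumerate(methods)}

d_pcc_scc = euclidean(profiles["PCC"], profiles["SCC"])
d_pcc_clr = euclidean(profiles["PCC"], profiles["CLR"])
print(f"PCC vs SCC: {d_pcc_scc:.2f}, PCC vs CLR: {d_pcc_clr:.2f}")
```

In this toy matrix the two correlation methods have similar scaled profiles and therefore a smaller distance to each other than to CLR, which is the kind of grouping a distance-based clustering would report.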
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
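The logarithmic regression described in the Figure 4 caption was fitted by the authors with lm() in R. As a rough Python equivalent, the sketch below fits average AUROC = a + b * ln(sample size) by ordinary least squares and reports R-squared. The data points are synthetic (generated exactly on a line with assumed coefficients), the function name `fit_log_model` is hypothetical, and p-values are not computed here.

```python
import math


def fit_log_model(sample_sizes, avg_auroc):
    """Least-squares fit of avg_auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(avg_auroc)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic data: sample sizes spanning the 12 to 1266 range of the study,
# with AUROC values placed exactly on an assumed curve 0.55 + 0.02 * ln(n).
sizes = [12, 50, 200, 800, 1266]
aurocs = [0.55 + 0.02 * math.log(n) for n in sizes]

intercept, slope, r2 = fit_log_model(sizes, aurocs)
print(f"intercept={intercept:.3f} slope={slope:.3f} R^2={r2:.3f}")
```

A positive slope on the log scale means that each multiplicative increase in sample size adds a roughly constant increment to average AUROC, so gains diminish as more libraries are added.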
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Fig 7, panels A to D. Panels A and B are Venn diagrams for the MS, PA and SA networks. Panel C (labeled Hub) shows enriched GO terms including chromatin assembly, nucleosome organization, gene silencing, DNA replication and DNA repair. Panel D (labeled Cell Wall) shows enriched GO terms including cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process and phenylpropanoid metabolic process.
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
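The overlap counts reported in the Figure 7 Venn diagrams amount to intersecting the top-ranked genes (or edges) of the three networks. The Python sketch below shows that operation on invented gene identifiers. The ranked lists, the cutoff `k` and the helper `top_k` are hypothetical, and the real analysis used the 1000 highest interacting genes and the top 1 x 10^6 edges.

```python
def top_k(ranked_items, k):
    """Return the k highest-ranked items as a set (input is sorted best first)."""
    return set(ranked_items[:k])


# Hypothetical rankings of genes by number of interactions in each network.
pa_ranked = ["g1", "g2", "g3", "g4"]   # PCC ranked aggregation network
sa_ranked = ["g2", "g1", "g5", "g3"]   # SCC ranked aggregation network
ms_ranked = ["g3", "g2", "g6", "g1"]   # MRNET single network

k = 3
pa, sa, ms = top_k(pa_ranked, k), top_k(sa_ranked, k), top_k(ms_ranked, k)

shared_all = pa & sa & ms          # center of the Venn diagram
pa_sa_only = (pa & sa) - ms        # shared by PA and SA but absent from MS
print(sorted(shared_all), sorted(pa_sa_only))
```

For edges rather than genes, each undirected edge could be stored as a `frozenset` of its two gene identifiers so that (a, b) and (b, a) are treated as the same edge.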
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science (80- ) 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
correlation methods are more computationally efficient, while MI methods are able to reveal non-linear relationships (Li et al., 2015c). PCC is widely used but may be influenced by outliers (Mukaka, 2012). SCC, KCC and BIC are less sensitive to outliers because SCC and KCC consider only rank information and BIC is calculated from the dataset median instead of the mean (Serin et al., 2016). Recently, GCC has been shown to be a better correlation method for gene expression analysis because of its capacity to detect non-linear relationships and its insensitivity to outliers (Ma and Wang, 2012). CSC is widely used for text mining and for analyzing sparse data with many zeros (Dhillon and Modha, 2001). ARACNE, MRNET and CLR showed extended gene-dependent relationships under variable biological settings (Margolin et al., 2006; Faith et al., 2007; Meyer et al., 2007; Li et al., 2013b). To estimate the effectiveness of the inference methods, the same testing parameters and AUROC calculations were used as described for the testing of normalization methods.
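The contrast between PCC and the rank-based SCC can be illustrated with a small sketch. This is not the code used in the study; the two expression vectors are invented, and the functions are plain-Python stand-ins for the corresponding statistics. A single extreme sample lowers the Pearson coefficient while the Spearman coefficient, which uses only ranks, is unaffected.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation coefficient (SCC): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical expression values for two genes across six samples;
# the last sample of gene_b is an outlier.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.1, 2.3, 2.9, 4.2, 5.1, 60.0]

print("PCC:", round(pearson(gene_a, gene_b), 3))
print("SCC:", round(spearman(gene_a, gene_b), 3))
```

Because the two genes remain perfectly monotonic, SCC stays at 1 while PCC drops well below it.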
Assessed by GO datasets, the 277 AUROC values were averaged to create one average value for each of the 10 inference methods, ranging from 0.620 to 0.724 (Fig. 2D). The average AUROC across all normalization methods was 0.718 for the six correlation methods and 0.646 for the four MI methods. The majority of the 277 GO terms had similar AUROC values in the GCNs generated by the different correlation methods, and these patterns differed from those observed in the MI-generated GCNs (Fig. 3A). The similarity among methods was also detectable by pairwise comparison using Pearson correlations between the different methods (Supplemental Fig. 4A).
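As a reminder of what an averaged AUROC summarizes, the sketch below computes AUROC as the probability that a randomly chosen positive gene is scored above a randomly chosen negative gene, then averages over terms. The scoring procedure actually used is described in the Methods, outside this section; the scores and labels here are invented, and `auroc` and `mean_auroc` are illustrative names only.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def mean_auroc(per_term):
    """Average AUROC over a list of (scores, labels) pairs, one per term."""
    values = [auroc(scores, labels) for scores, labels in per_term]
    return sum(values) / len(values)

# Two hypothetical GO terms, four genes each (1 = annotated to the term).
term_1 = ([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])  # perfect ranking
term_2 = ([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0])  # no better than chance
print(mean_auroc([term_1, term_2]))
```

An AUROC of 0.5 corresponds to random ranking, which is why averages in the 0.62 to 0.72 range indicate modest but real predictive signal.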
To evaluate network inference methods with the PPPTY dataset, the AUROC values for 1720 genes were averaged for each combination of normalization and inference methods (Fig. 2E). This evaluation also showed that networks constructed using correlation methods had higher AUROC values than those built with MI methods, although the CSC method resulted in lower AUROC values than the other correlation methods. As in the GO evaluation, results from correlation methods were more similar to each other than those from the MI methods (Supplemental Fig. 4B). Interestingly, heatmap results indicated that a subset of genes consistently had higher AUROC values when CSC, MRNET/CLR or AA/MA were used (Fig. 3B), although this subset was small enough that the average AUROC value over the whole gene set remained relatively low for those methods. The gene sets with the highest AUROC values in PCC, CSC or MRNET were extracted. The gene sets were compared by average expression (CPM) and by average number of low-expression elements (CPM < 0). The CSC gene set had the smallest number of low-expression elements and had higher average expression than both the 1720-gene set and the PCC gene set (Supplemental Fig. 5). This may indicate that the CSC method is better at determining co-expression for highly expressed genes.
The AUROC values from 26 targets in the HDA101 ChIP-Seq dataset revealed that the CSC GCN had the highest AUROC value, and that MRNET/CLR GCNs had slightly higher scores than correlation methods (Fig. 2F). This could be explained by the small number of targets creating skewed results, but may also indicate that CSC/MI methods are more suitable for specific types of genes or interactions between genes (Tzfadia et al., 2016). HDA101 is a highly expressed gene in all samples, with an average expression value of 864 CPM and a minimum expression of 289 CPM, so it is possible that HDA101 is more suitable for the CSC method. CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5% non-zero values for all comparisons, so these techniques were not included in any additional analyses.
In conclusion, our results indicated that the widely used correlation methods produced a more predictive maize GCN from a single expression matrix, but co-expression with some individual genes may be better detected using MI methods. Normalization method did not have a substantial influence on GCN performance, so only CPM normalization was used, in conjunction with PCC, SCC, MRNET and CLR inference, for subsequent optimization of other parameters.

Increasing Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were conducted with different numbers of samples and experiments to determine empirically the effect of sample number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between 12 and 404 libraries. For this analysis, CPM normalization followed by each of four inference methods (PCC, SCC, MRNET and CLR) was applied to the 17 experiments, and the 68 resulting networks were evaluated by both GO and PPPTY.
In both the GO and PPPTY evaluations, all algorithms exhibited a positive linear relationship between the natural logarithm of sample size and average AUROC values (Fig. 4). The linear relationships were stronger for the PCC and SCC methods, which had higher r-squared values, indicating that correlation methods benefit more from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC values than the MRNET and CLR methods in the PPPTY and GO analyses for most of the individual networks (Fig. 5).
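The relationship described here is an ordinary least-squares fit of average AUROC against ln(sample size). A minimal sketch follows; the sample sizes and AUROC values are invented for illustration and are not the values plotted in Fig. 4.

```python
import math

def fit_log_linear(sample_sizes, aurocs):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical experiments: number of libraries and average AUROC.
sizes = [12, 30, 60, 150, 404]
aurocs = [0.55, 0.58, 0.60, 0.63, 0.66]

slope, intercept, r_squared = fit_log_linear(sizes, aurocs)
print(f"slope={slope:.4f} intercept={intercept:.4f} r2={r_squared:.4f}")
```

A positive slope on the log scale means each doubling of sample size adds a roughly constant increment to AUROC, so the gains per added library shrink as experiments grow.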

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also change the outcome of a GCN by buffering the effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated, rank-standardized correlation/MI matrices were calculated from separate experiments to determine whether this approach enhanced GCN performance. Aggregating individual networks for meta-analysis can help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET and CLR methods for GCN inference, as before, resulting in 68 single GCNs. The 17 experiments were aggregated for PCC, SCC, MRNET and CLR individually and evaluated with the GO and PPPTY datasets.
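One plausible reading of rank-standardized aggregation is sketched below: within each experiment, edge scores are converted to ranks scaled to (0, 1], and the scaled ranks are then averaged across experiments. This is an assumption about the general technique, not the exact procedure in the Methods; the function names and toy scores are hypothetical, only edges present in every experiment are kept, and ties are broken arbitrarily.

```python
def rank_standardize(edge_scores):
    """Map each edge score to its rank divided by the number of edges.

    The weakest edge gets 1/n and the strongest gets 1.0.
    """
    ordered = sorted(edge_scores, key=edge_scores.get)
    n = len(ordered)
    return {edge: (i + 1) / n for i, edge in enumerate(ordered)}

def aggregate(networks):
    """Average rank-standardized scores over edges shared by all networks."""
    common = set(networks[0])
    for net in networks[1:]:
        common &= set(net)
    ranked = [rank_standardize({e: net[e] for e in common}) for net in networks]
    return {e: sum(r[e] for r in ranked) / len(ranked) for e in common}

# Hypothetical correlation scores from two experiments.
exp_1 = {("a", "b"): 0.9, ("a", "c"): 0.5, ("b", "c"): 0.1}
exp_2 = {("a", "b"): 0.8, ("a", "c"): 0.2, ("b", "c"): 0.6}

aggregated = aggregate([exp_1, exp_2])
for edge, score in sorted(aggregated.items(), key=lambda kv: -kv[1]):
    print(edge, round(score, 3))
```

Working on ranks rather than raw scores puts experiments with different sizes and score distributions on a common scale, so an edge is promoted only when it ranks highly in most experiments.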
Of the four aggregated networks that were evaluated, the two correlation-based networks (PCC and SCC) had higher AUROC values than the corresponding single networks built from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR networks compared with single networks built from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy depends on sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not show improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCNs performed best using a ranked aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a robust GCN.

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level and function of a protein of interest. However, many researchers have found inconsistencies between mRNA and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al., 2016). Although relatively little protein expression data is available, such data are amenable to GCN construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression atlas from 23 maize tissues based on mass spectrometry data (Walley et al., 2016), four protein networks were built with PCC, SCC, MRNET and CLR separately and then evaluated using the same PPPTY and GO datasets as described above.
GCNs constructed from protein expression did not exhibit AUROC values superior to those observed for RNA-Seq-based GCNs built using the aggregation strategy (Fig. 6). When evaluated with the GO and PPPTY datasets, the performance of the protein network was lower than that of the aggregated network as well as the single network from 1266 samples. To confirm this result, a two-way ANOVA with pairwise comparison was computed for the GO evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 overlapped with our RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes to construct a protein network (Supplemental Fig. 7). No improvement could be detected in the protein network derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY evaluation from a one-sided Wilcoxon rank sum test.

PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based on biological characteristics, networks can be compared on several network characteristics, including clustering coefficient, number of nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007), number of detected modules and number of genes in the largest module. Number of nodes is a basic construct in graph theory that describes the scale of a network. Clustering coefficient and number of modules describe how densely nodes are connected in a network. Heterogeneity measures the variability of node connections. Centralization indicates how likely it is that some nodes have substantially more connections than average. In this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize networks were selected for comparison of basic network characteristics on the basis of their overall performance: the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single network from 1266 total samples (MS). The three networks were constrained to include the top one million predicted interactions, or edges.
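Heterogeneity and centralization can both be computed from node degrees alone. The sketch below uses the commonly cited degree-based definitions (heterogeneity as the coefficient of variation of degree; centralization as n/(n-2) times the difference between the maximum scaled degree and the network density). These formulas are stated here as an assumption about what the cited definitions amount to, the use of population variance is also an assumption, and the toy graphs are invented.

```python
import math
from collections import Counter

def degrees(edges):
    """Degree of each node in a simple undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def heterogeneity(deg):
    """Coefficient of variation of node degree (population variance)."""
    k = list(deg.values())
    mean = sum(k) / len(k)
    var = sum((d - mean) ** 2 for d in k) / len(k)
    return math.sqrt(var) / mean

def centralization(deg):
    """n/(n-2) * (max degree/(n-1) - density); requires more than 2 nodes."""
    k = list(deg.values())
    n = len(k)
    density = sum(k) / (n * (n - 1))
    return n / (n - 2) * (max(k) / (n - 1) - density)

# A star (one hub, four leaves) versus a five-node ring.
star = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("hub", "g4")]
ring = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5"), ("g5", "g1")]

for name, edges in (("star", star), ("ring", ring)):
    deg = degrees(edges)
    print(name, round(heterogeneity(deg), 3), round(centralization(deg), 3))
```

The star reaches the maximum centralization of 1 and a high heterogeneity, while the ring, in which every node has the same degree, scores 0 on both. Nodes without any edge are not counted in this sketch.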
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution (Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS network had the highest network centralization value. The network heterogeneity value of MS was more than twice that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table S4), consistent with the highest centralization value observed for this network. Centralization and heterogeneity are two ways to summarize the degree distribution of a network. A scale-free network with more hubs has larger values of centralization and heterogeneity; conversely, a network with larger values of centralization and heterogeneity may contain a larger number of hubs, or it may have a number of hubs that is not especially large but a degree distribution that is extremely imbalanced. In biological networks, many observations have connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the possibility that the high values resulted from an extremely imbalanced degree distribution. For the MS network, most highly connected genes interacted with a large number of lowly connected genes; this pattern is also reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental Fig. 8).
The genes with the most interactions are expected to act as key components in GCNs (Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from each network were analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig. 7A). 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong candidates for central biological regulators. The annotation of these genes suggests their participation in a range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation and gene silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic regulators were found among the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011), CHR106 (GRMZM2G071025) (Li et al., 2014a) and LBL1 (GRMZM2G020187) (Dotto et al., 2014), demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al., 2017).
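The hub comparison amounts to ranking genes by degree in each network, keeping the top k, and intersecting the resulting sets. A minimal sketch with invented toy networks follows; `top_hubs` and `shared_fraction` are illustrative names, and ties at the cutoff are resolved arbitrarily here, which may differ from the handling in the study.

```python
from collections import Counter

def top_hubs(edges, k):
    """Return the k genes with the most interactions in an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {gene for gene, _ in deg.most_common(k)}

def shared_fraction(hubs_a, hubs_b):
    """Fraction of hubs in the first set that also appear in the second."""
    return len(hubs_a & hubs_b) / len(hubs_a)

# Three hypothetical networks; the first two resemble each other.
net_1 = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
net_2 = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("b", "e"), ("x", "y")]
net_3 = [("p", "q"), ("p", "r"), ("p", "s"), ("a", "p"), ("q", "r")]

hubs = [top_hubs(net, 3) for net in (net_1, net_2, net_3)]
print("net_1 vs net_2:", shared_fraction(hubs[0], hubs[1]))
print("net_1 vs net_3:", shared_fraction(hubs[0], hubs[2]))
print("shared by all:", hubs[0] & hubs[1] & hubs[2])
```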
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS network had fewer but larger modules than the PA and SA networks. Most genes in the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, so these methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA and MS were compared, and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA and SA shared 83.5% of their interactions, while MS had 87.3% unique interactions (Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php
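The merging step can be sketched as a set intersection over the top-scoring edges of each network. In the sketch below, edges are stored as unordered gene pairs so that (g1, g2) and (g2, g1) count as the same interaction; the scores, gene names and the cutoff of two edges are invented stand-ins for the one-million-edge cutoff.

```python
def top_edges(scored_edges, n):
    """Return the n highest-scoring edges as a set of unordered gene pairs."""
    ranked = sorted(scored_edges.items(), key=lambda kv: kv[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:n]}

def merge_networks(*edge_sets):
    """Keep only the edges that are present in every network."""
    merged = set(edge_sets[0])
    for edges in edge_sets[1:]:
        merged &= edges
    return merged

# Hypothetical scored networks standing in for PA, SA and MS.
pa = {("g1", "g2"): 0.90, ("g2", "g3"): 0.80, ("g1", "g3"): 0.10}
sa = {("g2", "g1"): 0.95, ("g3", "g2"): 0.70, ("g1", "g4"): 0.20}
ms = {("g1", "g2"): 0.50, ("g3", "g4"): 0.40, ("g2", "g3"): 0.10}

merged = merge_networks(top_edges(pa, 2), top_edges(sa, 2), top_edges(ms, 2))
genes = set().union(*merged)
print(len(genes), "genes,", len(merged), "interactions")
```

Requiring an edge to appear in all three networks trades coverage for confidence, which is consistent with the merged network retaining only about a tenth of the one million edges in each input.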

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the cell wall biosynthesis pathway predicted from the merged network was compared with existing knowledge of this pathway. Sixteen well-characterized components of cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes and one glycosidase gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in the network that met the analysis criteria. As expected, cell wall-related GO terms were enriched among these 214 genes (Fig. 7D; Supplemental Table S9).
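Extracting a guide-gene subnetwork can be sketched as selecting every edge that touches at least one guide gene. Whether edges between two non-guide neighbors were also retained in the study is not stated in this section, so the sketch assumes they were not; the gene identifiers are placeholders rather than real maize gene IDs.

```python
def guide_subnetwork(edges, guide_genes):
    """Select edges touching a guide gene.

    Returns (genes in the subnetwork, selected edges, guide genes that
    have no co-expressed partner in the network).
    """
    guides = set(guide_genes)
    sub_edges = [(u, v) for u, v in edges if u in guides or v in guides]
    genes = {gene for edge in sub_edges for gene in edge}
    orphans = guides - genes
    return genes, sub_edges, orphans

# Hypothetical merged network and guide genes.
network = [
    ("guide1", "geneA"),
    ("guide1", "geneB"),
    ("guide2", "geneB"),
    ("geneA", "geneB"),   # no guide gene involved, so not selected
    ("geneC", "geneD"),
]
guides = ["guide1", "guide2", "guide3"]

genes, sub_edges, orphans = guide_subnetwork(network, guides)
print(sorted(genes), len(sub_edges), sorted(orphans))
```

Guide genes returned as orphans correspond to the situation reported for two of the 16 guide genes, which had no co-expressed partners meeting the analysis criteria.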
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize genes and their Arabidopsis homologs as queries (Supplemental Table S10). The literature survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly selected genes as queries returned only 3.27% (7/214) that were involved in cell wall-related pathways. This suggests that the network discriminated co-expressed genes and identified some known components of the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT, CCoAOMT1 and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11, IRX9, IRX14 and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6) and CESA3 (Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same 16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes (Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very well, with 14 out of 16 genes showing predicted protein associations (Fig. 8C), resulting in 817 interactions with 76 genes; 48% (36/75) of the co-expressed genes were experimentally confirmed (Supplemental Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes (PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more power to find genes that function together without physically interacting. This case study shows that a robust, optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network is provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, use of this technology is quickly increasing. With over five thousand libraries available for maize, there is now ample data to support GCN analysis. This comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq data will provide a useful set of optimized parameters to support these analyses.
In our analysis, VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is applied on a library basis, which means genes within the same library are normalized by similar
factors. So when the network is constructed by PCC or BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. The two rank correlations, SCC
and KCC, consider only differences in relative rankings, on which normalization has a limited effect. The same is
true for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method
implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods produced similar
expression distributions (Supplemental Fig. 2), MI estimates from the different normalizations are expected to be
similar.
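One part of the argument about rank correlations can be checked directly: a rank correlation such as SCC is unchanged by any strictly increasing transformation of the expression values, including log2(x + 1), whereas PCC is not. The Python sketch below illustrates only this property. The expression values are hypothetical, the helper functions are written for this example, and the sketch does not model library-specific scale factors such as TMM.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical CPM values for two genes across six libraries.
gene_a = [1.0, 4.0, 9.0, 120.0, 15.0, 60.0]
gene_b = [2.0, 3.0, 30.0, 80.0, 10.0, 400.0]

log_a = [math.log2(v + 1) for v in gene_a]
log_b = [math.log2(v + 1) for v in gene_b]

scc_raw, scc_log = spearman(gene_a, gene_b), spearman(log_a, log_b)
pcc_raw, pcc_log = pearson(gene_a, gene_b), pearson(log_a, log_b)
print(f"SCC raw={scc_raw:.4f} log2={scc_log:.4f}")
print(f"PCC raw={pcc_raw:.4f} log2={pcc_log:.4f}")
```

In this toy case the two SCC values coincide while the two PCC values differ, which shows that the log transform alone cannot affect a rank-based network.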
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study of human GCN analysis (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods capture different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed more robust performance than any single inference method in in silico and E. coli
expression networks, referring to “the wisdom of crowds”. However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an
RNA-Seq based maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plants Release 31
(http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or
HiSeq 2500 instruments, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010).
The downloaded files were converted to fastq format using the fastq-dump command in SRA Toolkit (version
2.5.2). Adapters were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files
were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc).
HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated
with featureCounts 1.5.0 (Liao et al., 2014) from the aligned bam files (Supplemental Fig. S1). Twenty-six
libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were
excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol
was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
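The library exclusion rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the Snakemake workflow used in the study: the Library record, its field names, and the run labels are hypothetical, and the alignment rate is assumed to be stored as a fraction.

```python
from dataclasses import dataclass

@dataclass
class Library:
    run_id: str            # hypothetical run label
    total_reads: int       # total reads in the library
    alignment_rate: float  # overall alignment rate as a fraction (0 to 1)

MIN_READS = 5_000_000      # libraries with fewer reads are excluded
MIN_ALIGNMENT_RATE = 0.70  # libraries aligning below this rate are excluded

def passes_qc(lib):
    """Keep a library only if it meets both thresholds."""
    return lib.total_reads >= MIN_READS and lib.alignment_rate >= MIN_ALIGNMENT_RATE

libraries = [
    Library("run_A", 12_400_000, 0.91),
    Library("run_B", 3_900_000, 0.88),  # too few reads
    Library("run_C", 8_100_000, 0.64),  # alignment rate too low
    Library("run_D", 5_000_000, 0.70),  # exactly at both thresholds: kept
]
kept = [lib.run_id for lib in libraries if passes_qc(lib)]
print(kept)
```

Because the text excludes libraries with "less than" each threshold, a library sitting exactly on a threshold is retained in this sketch.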

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et
al., 2010) in the R environment and then log2 transformed (expression = log2(CPM + 1) or log2(RPKM + 1)). For
both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR.
The Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only
genes with expression higher than 2 CPM in more than 1000 samples were retained for further analysis (15116
genes).
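For readers who want to see the arithmetic, the following Python sketch works through CPM, RPKM, the log2(x + 1) transform, and the gene filter on a toy library. It is a simplification and not the edgeR code used in the study: it divides by the raw library size and omits the TMM scale factors, and all counts, gene lengths, and function names are hypothetical.

```python
import math

def cpm(counts, library_size):
    """Counts per million for one library (no TMM scaling in this sketch)."""
    return [c / library_size * 1e6 for c in counts]

def rpkm(counts, gene_lengths_bp, library_size):
    """Reads per kilobase of gene length per million reads in the library."""
    return [
        c / (length / 1e3) / (library_size / 1e6)
        for c, length in zip(counts, gene_lengths_bp)
    ]

def log2_plus_one(values):
    """log2(x + 1), so that zero expression maps to zero."""
    return [math.log2(v + 1) for v in values]

def keep_gene(cpm_across_samples, min_cpm=2.0, min_samples=1000):
    """True if the gene exceeds min_cpm in more than min_samples samples."""
    n_expressed = sum(1 for v in cpm_across_samples if v > min_cpm)
    return n_expressed > min_samples

# Toy library with four genes (hypothetical counts and lengths).
counts = [500, 1500, 0, 8000]
lengths_bp = [1000, 3000, 2000, 4000]
library_size = sum(counts)

cpm_values = cpm(counts, library_size)
rpkm_values = rpkm(counts, lengths_bp, library_size)
expression = log2_plus_one(cpm_values)
print(cpm_values, rpkm_values, expression)
```

The first two genes have different CPM values but the same RPKM, because the second gene is three times longer and has three times the reads. The filter defaults mirror the thresholds in the text; the function is meant to be called once per gene with that gene's CPM values across all samples.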

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function.
Kendall rank Correlation Coefficient (KCC) was calculated using the cor.fk() function in the pcaPP package
(Filzmoser et al., 2009). Gini Correlation Coefficient (GCC) was calculated with the adjacencymatrix() function in
the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation (BIC) was computed with the bicor() function in
the WGCNA package (Langfelder and Horvath, 2008). Cosine Similarity Coefficient (CSC) was computed with the
cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the
parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference
methods were ranked, with the smallest value assigned a rank of one. The ranks were then divided by the number
of elements in the matrix, and the diagonal was set to one, so that all network weights range from zero to one.
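The rank standardization in the last step can be sketched as follows. This Python example is an illustration and not the R code used in the study; the function name and the toy matrix are hypothetical, and giving tied values their average rank is an assumption, since the tie-handling rule is not stated in the text.

```python
def rank_standardize(matrix):
    """Rank all entries (smallest = 1), divide by the entry count, set diagonal to 1.

    Tied values receive their average rank (an assumption of this sketch).
    """
    n = len(matrix)
    flat = [(matrix[i][j], i, j) for i in range(n) for j in range(n)]
    flat.sort(key=lambda item: item[0])
    total = len(flat)
    out = [[0.0] * n for _ in range(n)]
    pos = 0
    while pos < total:
        end = pos
        while end + 1 < total and flat[end + 1][0] == flat[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            _, i, j = flat[k]
            out[i][j] = avg_rank / total
        pos = end + 1
    for i in range(n):
        out[i][i] = 1.0
    return out

# Hypothetical symmetric matrix of correlation weights for three genes.
weights = [
    [1.0, 0.2, 0.9],
    [0.2, 1.0, -0.5],
    [0.9, -0.5, 1.0],
]
scaled = rank_standardize(weights)
for row in scaled:
    print([round(v, 3) for v in row])
```

Only the ordering of the weights survives this transformation, which puts correlation coefficients and mutual information values on a common scale before networks are compared.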

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in the CPM or VST normalized expression
matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR
methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For
MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their
results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2
(Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology
data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification
problems was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of
co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate
AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package
(Ballouz et al., 2016) in R. EGAD uses the “guilt-by-association” principle that genes with shared GO
terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be
evaluated by hiding a subset of genes' GO terms and testing whether the hidden GO terms can be predicted
from the remaining annotations. The performance of the prediction model was measured by AUROC values in
three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova()
and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to “fdr”
(Benjamini and Hochberg, 1995).
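To make the AUROC statistic concrete, the Python sketch below computes it from its probabilistic definition: the chance that a randomly chosen positive gene pair receives a higher network weight than a randomly chosen negative pair, with ties counted as one half. This is not the ROCR or EGAD implementation used in the study, and the edge weights and labels are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as P(positive score > negative score), counting ties as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical rank-standardized weights for six gene pairs, and whether each
# pair is listed as co-expressed in a reference dataset (1) or not (0).
edge_weights = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
in_reference = [1, 1, 0, 1, 0, 0]

print(round(auroc(edge_weights, in_reference), 4))
```

A value of 0.5 corresponds to a network that ranks reference pairs no better than chance, which is the role played by the shuffled networks described above. The pairwise loop is quadratic and intended only for small examples.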
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the
evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are
co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not
co-expressed in the PPPTY dataset; TN, a network predicts two genes are not co-expressed and they are not
co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are
co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has
a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a
specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not
have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts a gene
does not have a specific GO term but it does have that GO term in our GO dataset.
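The pairwise definitions for the PPPTY evaluation can be written out as a short Python sketch. It assumes that a network has already been reduced to a set of predicted co-expressed gene pairs; the gene labels, edge lists, and function name are hypothetical.

```python
from itertools import combinations

def confusion_counts(predicted_edges, reference_edges, genes):
    """Count TP, FP, TN, FN over all unordered gene pairs."""
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}
    tp = fp = tn = fn = 0
    for pair in combinations(sorted(genes), 2):
        key = frozenset(pair)
        if key in predicted and key in reference:
            tp += 1   # predicted and present in the reference
        elif key in predicted:
            fp += 1   # predicted but absent from the reference
        elif key in reference:
            fn += 1   # missed reference pair
        else:
            tn += 1   # correctly not predicted
    return tp, fp, tn, fn

genes = ["g1", "g2", "g3", "g4"]
predicted = [("g1", "g2"), ("g1", "g3"), ("g2", "g4")]
reference = [("g2", "g1"), ("g3", "g4"), ("g2", "g4")]  # pair order is irrelevant

tp, fp, tn, fn = confusion_counts(predicted, reference, genes)
true_positive_rate = tp / (tp + fn)
false_positive_rate = fp / (fp + tn)
print(tp, fp, tn, fn, true_positive_rate, false_positive_rate)
```

Sweeping the threshold that turns edge weights into predicted pairs, and plotting the true positive rate against the false positive rate at each step, traces the ROC curve whose area is the AUROC.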

Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network
topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity
distributions and node degree distributions were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012).
Graph clustering was performed using the Markov Cluster Algorithm (MCL) in MCL v14-137 with the inflation
value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
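Selecting the strongest edges and counting node degree, the two quantities this section starts from, can be sketched in Python as follows. The edge list and function names are hypothetical, and the sketch keeps three edges rather than one million; it does not reproduce the Cytoscape or MCL steps.

```python
import heapq
from collections import Counter

def top_edges(weighted_edges, n):
    """Return the n highest-weight edges, strongest first."""
    return heapq.nlargest(n, weighted_edges, key=lambda edge: edge[2])

def node_degrees(edges):
    """Number of retained edges touching each gene."""
    degree = Counter()
    for gene_a, gene_b, _weight in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return degree

# Hypothetical (gene, gene, weight) triples from a rank-standardized network.
edges = [
    ("g1", "g2", 0.99),
    ("g1", "g3", 0.42),
    ("g2", "g3", 0.87),
    ("g3", "g4", 0.91),
    ("g2", "g4", 0.15),
]
stringent = top_edges(edges, 3)
degrees = node_degrees(stringent)
print(stringent)
print(dict(degrees))
```

A heap-based selection avoids sorting the full edge list, which matters when the complete network has on the order of 10^8 gene pairs.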

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010).
The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used
to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was
used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The
results were then imported into Cytoscape for visualization.
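The hypergeometric test behind the enrichment p-value can be illustrated with a small worked example in Python. This is a sketch of the standard upper-tail test and not AgriGO's implementation; the function name and the toy numbers are hypothetical, and the Yekutieli correction is not included.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric p-value P(X >= k).

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the query list, k: query genes annotated with the term.
    """
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)

# Toy example: background of 20 genes, 5 carry the term,
# a query list of 6 genes contains 3 of them.
p_value = hypergeom_enrichment_p(k=3, n=6, K=5, N=20)
print(round(p_value, 4))
```

In the analysis described above, N would be the 15116 background genes and n the size of the module or gene list being tested.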

Database Comparison on the Cell Wall Pathway
Sixteen well-characterized (Penning et al., 2009; Bosch et al., 2011) components of cell wall biosynthesis
(Supplemental Table S8) were chosen as query genes to search against CORNET Maize
(https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0/) through its website and against the STRING
database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching
the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50.
This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence
cutoff was set to 0.4, with the maximum number of interactors set to 100. Seventy-six genes with 817 interactions
were retrieved. Maize proteins were searched against TAIR10 protein sequences using standalone BLASTP version
2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice
and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department
of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful
discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression Data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year that were generated by microarray
platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of
RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from GO datasets for samples normalized using the Variance Stabilizing Transformation
(VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using
VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman
Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for networks constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the
25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate
the highest and lowest AUROC values.

Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green
boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a
standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples
normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC,
CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation
Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini
Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE;
MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different
sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic
regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R.
PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy
NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC
(black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17
experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266
(S1266) libraries were plotted against sample size. Networks with the same number of samples are
designated as “_1”, “_2”, and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single network (white, “1266”), aggregated network (grey, “agg”),
and protein network (dark grey, “pr”) using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks.
Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars),
and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET
(m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate
outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey
bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s),
MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots
indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared among
the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET
single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks. 106591
edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared
highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214
co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA),
and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C,
Network retrieved from the STRING database queried by 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported to the R environment for the
normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue
and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than
10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our
datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype.
MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries
were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the
dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by
Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million
mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line,
CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST
values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using
the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed
expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and
plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental
stage. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated
by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation
comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is
plotted in the diagonal blocks, between the AUROC values and the PCC values. AUROC values evaluated with
the GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson
correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for
VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal
blocks, between the AUROC values and the PCC values. AUROC values evaluated with the PPPTY datasets are
plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients
shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC,
Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient;
BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of genes with
the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). The
average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed
elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC
values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted
against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to
1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples
are designated as “_1”, “_2”, and “_3”. Outliers were defined as values more than 1.5 times the interquartile
range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines.
Dashed lines are the average AUROC value from the 17 individual networks of each category. Mean values of
each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation
Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with
11429 genes (ppr). Both networks were constructed by Pearson Correlation
Coefficient (PCC). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate
outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red
and blue curves show the power-law fitted distribution. The R2 value indicates the goodness of fit to the
power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of
edges linked to the genes (node degree) was plotted against the number of genes with that degree (number of
nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in colors. Genes not in modules 1-8 are light grey nodes.


Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 25
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM 836 Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant 837 Genome J 4 191 838
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) 839 Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis 840 thaliana Proc Natl Acad Sci 104 15572ndash15577 841
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 842 42 143ndash175 843
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D 844 Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA 845 sequencing data analysis Brief Bioinform 14 671ndash683 846
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D 847 Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput 848 RNA sequencing data analysis Brief Bioinform 14 671ndash683 849
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization 850 of biological networks and protein structures Nature Protoc 7 670ndash85 851
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24 852
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis 853 of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA 854 pathway to maize development PLoS Genet 10 e1004826 855
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray 856 data using random matrix theory Hortic Res 2 15026 857
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community 858 Nucleic Acids Res 38 64-70 859
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein 860 families Nucleic Acids Res 30 1575ndash1584 861
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C 862 Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel 863 genes influencing glucose metabolism Proc Natl Acad Sci 111 13924ndash13929 864
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) 865 Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of 866 expression profiles PLoS Biol 5 0054ndash0066 867
Fedoroff N V (2012) McClintockrsquos challenge in the 21st century Proc Natl Acad Sci 109(50) 20200ndash20203 868
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules 869 between two grass species maize and rice Plant Physiol 156 1244ndash56 870
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1 871
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing 872 reveals the complex regulatory network in the maize kernel Nature Commun 42832 873
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent 874 Variables Artificial Intelligence and Statistics 277-286 875
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function 876 Bioinformatics 27 1860ndash1866 877
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression 878 networks in Arabidopsis thaliana Bioinformatics 2 1ndash8 879
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 26
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR 880 (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc 881 Natl Acad Sci 107 12866ndash12871 882
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges 883 Bioinform Biol Insights 9 29ndash46 884
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 885 4 e1000117 886
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene 887 Expression in Maize Int Rev Cell Mol Biol 328 25ndash48 888
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de 889 novo coexpression network inference Bioinformatics 28 1592ndash1597 890
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat 891 Methods 12 357ndash360 892
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 893 2520ndash2522 894
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning 895 causality from time and perturbation Genome Biol 14 123 896
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and 897 divergence times Mol Biol Evol 34 1812ndash1819 898
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene 899 association methods for coexpression network construction and biological knowledge discovery PLoS 900 One 7 e50411 901
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC 902 Bioinformatics 9 559 903
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019 904
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide 905 Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of 906 Opaque2 in Maize Plant Cell 27 532-545 907
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide 908 association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43ndash909 50 910
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High 911 Performance Reverse Engineering Analysis 2013 912
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of 913 Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347 914
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE 915 Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602ndash4616 916
Li S Łabaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and 917 correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888ndash895 918
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and 919 Analysis Trends Plant Sci 20 664ndash675 920
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence 921 reads to genomic features Bioinformatics 30 923ndash930 922
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures 923 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 27
Effects on reverse engineering gene networks Bioinformatics pp 282ndash288 924
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing 925 genes associated with complex agronomic traits in rice Plant J 90 177-188 926
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) 927 The genotype-tissue expression (GTEx) project Nat Genet 45 580ndash585 928
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data 929 with DESeq2 Genome Biol 15 1 930
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome 931 mapping based on collaborative filtering framework Sci Rep 5 7702 932
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in 933 transcriptome analysis Plant Physiol 160 192ndash203 934
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic 935 networks Bioinformatics 19 1423ndash1430 936
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-937 expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 938 9 e1003840 939
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR 940 Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796ndash804 941
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE 942 an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC 943 Bioinformatics 7 S7 944
Mark Cigan A Unger‐Wallace E Haug‐Collet K (2005) Transcriptional gene silencing as a tool for uncovering 945 gene function in maize Plant J 43 929ndash940 946
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 947 pp-10 948
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for 949 differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied 950 transcriptomes Commun Integr Biol 6 e25849 951
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792ndash952 801 953
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional 954 regulatory networks Eurasip J Bioinforma Syst Biol doi 101155200779879 955
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional 956 networks using mutual information BMC Bioinformatics 9 461 957
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J 958 Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis 959 Plant Genome 6 12 960
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A 961 Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize 962 pericarp color1 controlled genes Plant Cell 24 2745ndash64 963
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker 964 a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436 965
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian 966 transcriptomes by RNA-Seq Nat Methods 5 621ndash628 967
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 28
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 968 69ndash71 969
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks 970 for Arabidopsis Nucleic Acids Res 37 D987ndashD991 971
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene 972 modules with biological information in plants Bioinformatics 26 1267ndash1268 973
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol 974 Direct 4 14 975
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray 976 data BMC Bioinformatics 4 33 977
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush 978 J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data 979 bioRxiv 81802 980
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et 981 al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in 982 Maize Plant Cell Online 2 tpc114132506 983
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty 984 DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703ndash1728 985
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing 986 maize leaf Plant J 78 424ndash440 987
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput 988 transcriptome sequencing experiments Bioinformatics 29 2146ndash2152 989
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression 990 analysis of digital gene expression data Bioinformatics 26 139ndash140 991
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene 992 network reconstruction Bioinformatics 27 1876ndash1877 993
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why 994 stability does not indicate accuracy in a sea of changing annotations Database J Biol databases 995 curation 2016 996
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H 997 Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under 998 natural field conditions Nucleic Acids Res 39 D1141ndashD1148 999
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize 1000 transcriptomes using COB the co-expression browser PLoS One doi 101371journalpone0099193 1001
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package 1002
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics 1003 Science (80- ) 326 1112ndash1115 1004
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global 1005 quantification of mammalian gene expression control Nature 473 337ndash342 1006
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-1007 expression modules in mouse crosses Frontiers in Genetics 20134291 1008
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities 1009 and Challenges Front Plant Sci 7 444 1010
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) 1011 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 29
Cytoscape a software environment for integrated models of biomolecular interaction networks Genome 1012 Res 13 2498ndash2504 1013
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R 1014 Bioinformatics 21 3940ndash3941 1015
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of 1016 viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443ndash458 1017
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information 1018 correlation and model based indices BMC Bioinformatics 13 328 1019
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An 1020 expanded maize gene expression atlas based on RNA-sequencing and its use to explore root 1021 development Plant Genome 314ndash362 1022
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A 1023 Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life 1024 Nucleic Acids Res 43 D447ndashD452 1025
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative 1026 Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194 1027
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S 1028 Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and 1029 caveats Plant Cell Environ 32 1633ndash51 1030
USDA (2016) Grain World Markets and Trade 1031
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant 1032 Physiol 153 895ndash905 1033
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism 1034 and Beyond Nat Rev Mol Cell Biol 16 258ndash264 1035
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR 1036 (2016) Integration of omic networks in a developmental atlas of maize Science 353 814ndash818 1037
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene 1038 ranking BMC Bioinformatics 16 S6 1039
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring genendashgene interactions and functional 1040 modules using sparse canonical correlation analysis Ann Appl Stat 9 300ndash323 1041
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 1042 57ndash63 1043
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray 1044 experiments BMC Genomics 5 87 1045
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability ofldquo guilt-by-associationrdquo 1046 within gene coexpression networks BMC Bioinformatics 6 227 1047
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) ldquoOut of Pollenrdquo Hypothesis for Origin of New 1048 Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822ndash2829 1049
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of 1050 Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during 1051 Seed Development Plant Cell 28 629ndash645 1052
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 1053 13 83 1054
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC 1055 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 30
Bioinformatics 12 290 1056
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene 1057 network reconstruction PLoS One 9 e106319 1058
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation 1059 Plant Cell Physiol 56 195ndash214 1060
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein 1061 interaction database for Maize Plant Physiol 170 pp1501821- 1062
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) 1063 The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015 1064
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of Illumina RNA-Seq samples per year (solid line), 2008-2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from comparisons with the GO dataset for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
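The AUROC values reported in this caption can be illustrated with a small rank-based calculation. The following is a minimal sketch in Python, not the authors' pipeline: the function name `auroc` and the toy scores are hypothetical. It assumes each candidate gene has a co-expression-derived score and a binary label (for example, annotated to a GO term or not), and it uses the standard rank-sum (Mann-Whitney U) identity, with tied scores sharing an average rank.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity.

    scores: numeric score per gene (higher = more likely positive).
    labels: truthy for positives (e.g. annotated genes), falsy otherwise.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank shared by the ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical toy data: positives score higher than negatives.
example = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
```

An AUROC of 0.5 corresponds to a random ranking, which is why the sketch returns 0.5 when every score is tied.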
Figure 3. Similarity between the ten inference methods in network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are displayed on a color scale from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
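The scaling and distance steps named in this caption can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' code: the table `AUROC_TABLE`, the method subset `METHODS`, and the helper names `zscore` and `method_distances` are all hypothetical. Each row is assumed to hold the AUROC values of one GO term (or gene) across methods. Rows are scaled to zero mean and unit standard deviation, and methods are then compared by the Euclidean distance between their scaled columns.

```python
import math
import statistics

# Hypothetical AUROC values: one row per GO term, one column per method.
METHODS = ["PCC", "SCC", "CLR"]
AUROC_TABLE = [
    [0.70, 0.72, 0.60],
    [0.65, 0.66, 0.80],
    [0.80, 0.79, 0.55],
]


def zscore(values):
    """Scale one row to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("row has no variation and cannot be scaled")
    return [(v - mean) / sd for v in values]


def method_distances(auroc_table, methods):
    """Euclidean distance between every pair of methods on scaled AUROCs."""
    scaled = [zscore(row) for row in auroc_table]
    columns = list(zip(*scaled))  # one tuple of scaled values per method
    distances = {}
    for i, a in enumerate(methods):
        for j in range(i + 1, len(methods)):
            distances[(a, methods[j])] = math.dist(columns[i], columns[j])
    return distances


distances = method_distances(AUROC_TABLE, METHODS)
```

A hierarchical clustering step would then merge the closest methods first. In this toy table PCC and SCC are closer to each other than either is to CLR.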
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
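The logarithmic model described in this caption has the form average AUROC = a + b * ln(sample size). The sketch below fits that model by ordinary least squares in plain Python. It is an analogue of the R lm() fit, not a reproduction of it: the function name `fit_log_model` and the example numbers are hypothetical, and the p-value reported by lm() is deliberately omitted.

```python
import math


def fit_log_model(sample_sizes, avg_aurocs):
    """Least-squares fit of avg_auroc = intercept + slope * ln(sample_size).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(avg_aurocs)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data generated exactly from AUROC = 0.5 + 0.03 * ln(n).
sizes = [12, 100, 500, 1266]
aurocs = [0.5 + 0.03 * math.log(n) for n in sizes]
slope, intercept, r_squared = fit_log_model(sizes, aurocs)
```

Because the example data lie exactly on the model, the fit recovers a slope of about 0.03 and an R-squared of about 1. Real AUROC values would scatter around the line.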
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks built from the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
[Figure 7 panels. A and B: Venn diagrams comparing the MS, PA and SA networks. C (Hub): GO enrichment graph whose terms include chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, gene silencing and epigenetic regulation of gene expression. D (Cell Wall): GO enrichment graph whose terms include cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process and phenylpropanoid biosynthetic process.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with a reported function in cell wall related pathways in plants, and dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
CSC method, CPM and RPKM normalization methods had higher AUROC values than VST (Fig. 2C). Using two models of ARACNE (additive, AA, and multiplicative, MA), the co-expression matrices contained less than 0.5% non-zero values for all comparisons, and so these techniques were not included in any additional analyses. In conclusion, our results indicated that the widely used correlation methods resulted in a more predictive maize GCN from a single expression matrix, but co-expression with some individual genes may be better detected using MI methods. Normalization method did not have a substantial influence on the GCN's performance, so only CPM normalization was used in conjunction with PCC, SCC, MRNET and CLR inference for subsequent optimization of other parameters.

Increased Sample Size Had a Positive Effect on GCN
GCN analysis can be accomplished with a variable number of samples and datasets, but sample size can influence the quality of the resulting GCN (Wei et al., 2004; Ballouz et al., 2015). Separate analyses were conducted with different numbers of samples and experiments to empirically determine the effect of sample number on GCN effectiveness. The data in our analysis consisted of 17 experiments, each including between 12 and 404 libraries. For this analysis, CPM normalization followed by each of four inference methods (PCC, SCC, MRNET and CLR) was applied to the 17 experiments, and the 68 resulting networks were evaluated by both GO and PPPTY.
From GO and PPPTY evaluation, all algorithms exhibited a positive linear relationship between natural-log-transformed sample size and average AUROC values (Fig. 4). The linear relationships were stronger for the PCC and SCC methods, with higher r-squared values, indicating that correlation methods benefit more from increasing sample size. Thus, for building correlation-based GCNs, as many samples as possible should be included. We also found that, as seen for the total GCN analysis, PCC and SCC had higher average AUROC values than the MRNET and CLR methods for PPPTY and GO analysis for most of the individual networks (Fig. 5).
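The relationship described above can be illustrated with a minimal sketch of an ordinary least-squares fit of average AUROC against the natural logarithm of sample size. The study itself was carried out in R; Python is used here only for illustration, the function name fit_log_linear is hypothetical, and the numbers in the example are invented placeholders rather than values from this study.

```python
import math

def fit_log_linear(sample_sizes, aurocs):
    """Least-squares fit of AUROC = a + b * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    xs = [math.log(n) for n in sample_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(aurocs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in aurocs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical experiment sizes and average AUROC values (illustration only).
sizes = [12, 24, 60, 150, 404]
avg_auroc = [0.55, 0.58, 0.61, 0.64, 0.68]
intercept, slope, r_squared = fit_log_linear(sizes, avg_auroc)
```

A positive slope with a high r-squared value corresponds to the pattern reported for the correlation-based methods.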

Ranked Aggregation of Networks Improved Performance of GCNs
Ranked aggregation for meta-analysis can also be used to change the outcomes of GCN analysis by buffering the effect of sample heterogeneity (Zhong et al., 2014; Wang et al., 2015a; Asnicar et al., 2016). Aggregated rank-standardized correlation/MI matrices were calculated from separate experiments to determine whether this approach enhanced GCN performance. Aggregating individual networks together for meta-analysis can help to highlight true co-expression interactions and reduce noise (Zhong et al., 2014; Wang et al., 2015a; Wang et al., 2015b). This analysis was conducted with the 17 differently sized experiments using the PCC, SCC, MRNET and CLR methods for GCN inference as before, resulting in 68 single GCNs. The 17 experiments were aggregated for PCC, SCC, MRNET and CLR individually and evaluated by the GO and PPPTY datasets.
Of the four aggregated networks that were evaluated, the two correlation methods (PCC and SCC) had higher AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR networks compared with single networks with 1266 samples (two-tailed Wilcoxon rank test for GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCN performed best using a ranked aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a robust GCN.
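As a rough illustration of ranked aggregation, the sketch below rank-standardizes the edge weights of each experiment and then averages the standardized ranks per gene pair. It is a simplified Python stand-in, not the R code used in this study: the names rank_standardize and aggregate are hypothetical, networks are represented as dictionaries keyed by gene pair, and averaging is assumed as the aggregation rule.

```python
def rank_standardize(weights):
    """Map each gene pair's weight to rank / number_of_pairs.

    The smallest weight receives rank 1. Ties are broken by input order,
    which is a simplification.
    """
    ordered = sorted(weights, key=weights.get)
    total = len(ordered)
    return {pair: (i + 1) / total for i, pair in enumerate(ordered)}

def aggregate(networks):
    """Average the rank-standardized weights across experiments.

    Each network is a dict {(gene_a, gene_b): weight}; gene pairs are
    assumed to be written in the same orientation in every network.
    """
    sums, counts = {}, {}
    for weights in networks:
        for pair, rank in rank_standardize(weights).items():
            sums[pair] = sums.get(pair, 0.0) + rank
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: sums[pair] / counts[pair] for pair in sums}

# Two toy experiments with hypothetical correlation values.
exp1 = {("g1", "g2"): 0.9, ("g1", "g3"): 0.1, ("g2", "g3"): 0.5}
exp2 = {("g1", "g2"): 0.7, ("g1", "g3"): 0.3, ("g2", "g3"): 0.2}
aggregated = aggregate([exp1, exp2])
```

In this toy case the pair (g1, g2) ranks highest in both experiments and therefore keeps the top aggregated weight, while pairs whose ranks disagree between experiments are pulled toward the middle.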

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level and function of a protein of interest. However, many researchers have found inconsistencies between mRNA and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al., 2016). Although relatively little protein expression data is available, these data are amenable to GCN construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks were built with PCC, SCC, MRNET and CLR separately and then evaluated using the same PPPTY and GO datasets as previously mentioned.
GCNs constructed from protein expression did not exhibit superior AUROC values to those observed for RNA-Seq-based GCNs using the aggregation strategy (Fig. 6). When evaluated by the GO and PPPTY datasets, the performance of the protein network was lower than that of the aggregated network as well as the single network from 1266 samples. To confirm this result, a two-way ANOVA was computed with pairwise comparison for the GO evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for some types of interactions.
The raw protein expression data included 17862 genes, of which 11429 genes overlapped with our RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes to construct a protein network (Supplemental Fig. 7). No improvement could be detected in the protein network derived from 17862 genes, with p-values of 0.635 for GO evaluation and 0.995 for PPPTY evaluation from a one-sided Wilcoxon rank sum test.

PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be compared based upon several different network characteristics, including clustering coefficient, number of nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007), number of detected modules and number of genes in the largest module. Number of nodes is a basic construct in graph theory depicting the scale of a network. Clustering coefficient and number of modules model how densely nodes are connected in networks. Heterogeneity measures the variability of node connections. Centralization indicates how likely it is that some nodes have significantly more connections than average. In this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize networks were selected for comparison of basic network characteristics based on their overall performance: the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA), and the MRNET-built single network from 1266 total samples (MS). The three networks were constrained to include the top one million predicted interactions, or edges.
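For readers unfamiliar with these measures, the following Python sketch computes density, heterogeneity and centralization from an undirected edge list. It assumes the commonly cited degree-based definitions attributed to Dong and Horvath (2007): heterogeneity as the standard deviation of node degree divided by its mean, and centralization as n/(n-2) × (max degree/(n-1) - density). The population variance, the function name degree_statistics and the toy graphs are assumptions made for illustration; the values in this study were computed in Cytoscape.

```python
import math

def degree_statistics(edges):
    """Return (n_nodes, density, heterogeneity, centralization).

    `edges` is an iterable of undirected (node_a, node_b) pairs without
    duplicates or self-loops. Nodes with no edges are not counted, and
    the network must contain more than two nodes.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    n = len(degree)
    ks = list(degree.values())
    mean_k = sum(ks) / n
    var_k = sum((k - mean_k) ** 2 for k in ks) / n  # population variance (assumption)
    density = mean_k / (n - 1)
    heterogeneity = math.sqrt(var_k) / mean_k
    centralization = n / (n - 2) * (max(ks) / (n - 1) - density)
    return n, density, heterogeneity, centralization

# A star (one hub, four leaves) versus a ring (all degrees equal).
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
star_stats = degree_statistics(star)
ring_stats = degree_statistics(ring)
```

Under these definitions a star network, dominated by a single hub, has the maximum centralization, whereas a ring, in which every node has the same degree, has zero heterogeneity and zero centralization.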
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution (Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS network had the highest network centralization value. The network heterogeneity value of MS was over two times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table S4), consistent with the observed highest centralization value for this network. Centralization and heterogeneity are two measures that describe the degree distribution of networks. A scale-free network with more hubs has larger values of centralization and heterogeneity, while a network with larger values of centralization and heterogeneity may either contain a larger number of hubs, or contain a number of hubs that is not significantly large but have an extremely imbalanced degree distribution. In biological networks, many observations have connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the possibility that high values result from an extremely imbalanced degree distribution. For the MS network, most highly connected genes interacted with a large number of lowly connected genes; this pattern is also reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental Fig. 8). The genes with the most interactions are expected to act as key components in GCNs (Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig. 7A). 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong candidates for central biological regulators. The annotation of these genes suggests their participation in a range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation and gene silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular biochemistry. Ribosomal proteins were the largest component of top interacting genes (27/148), which was expected because of their cellular abundance and involvement with translation. Interestingly, nine epigenetic regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011), CHR106 (GRMZM2G071025) (Li et al., 2014a) and LBL1 (GRMZM2G020187) (Dotto et al., 2014), demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al., 2017).
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was used to identify network modules (Enright et al., 2002; Morris et al., 2011). The results showed a shared pattern between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS network had fewer but larger modules than the PA and SA networks. Consequently, most genes in the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so these methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA and MS were compared, and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions (Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged network was compared to the existing knowledge of this pathway. Sixteen well-characterized components of cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes and one glycosidase gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in the network that met the analysis criteria. As expected, cell wall related GO terms were enriched for these 214 genes (Fig. 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database to retrieve homologs and their annotations using BLASTP. The literature was manually searched using the maize genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall related pathways. This suggests that the network discriminated co-expressed genes and identified some known components of the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT, CCoAOMT1 and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11, IRX9, IRX14 and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6) and CESA3 (Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same 16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes (Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817 interactions with 76 genes. 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes (PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more power to find genes that function together without physically interacting. This case study shows that a robust, optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network is provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With over five thousand libraries available for maize, there is now ample data to support GCN analysis. This comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq data will provide a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM and RPKM normalization methods had equivalent outcomes for GCN analysis, consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided (Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the normalization is on a library basis, which means genes within the same library are normalized by similar factors. So when the network is constructed by PCC and BIC, where expression vectors are centered by mean or median values, the effect of different normalization methods is probably small. The two rank correlations, SCC and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same holds for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods shared similar expression distributions (Supplemental Fig. 2), MI estimations from different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall performance. This is consistent with a study of human GCN analysis (Ballouz et al., 2015), but SCC did not score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008; Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference methods showed a more robust performance than any single inference method in in silico and E. coli expression networks, referring to "the wisdom of crowds". However, for analysis of the available maize data, integration of PCC, SCC, MRNET and CLR together did not result in a network that outperformed the PCC and SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq-based maize GCN. This optimization may apply to a range of datasets with shared characteristics of maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or HiSeq 2500 instruments, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files were converted to fastq format using the fastq-dump command in the SRA Toolkit (version 2.5.2). The adapters were trimmed from the fastq files by Cutadapt 1.8.1 (Martin, 2011). The adapter-removed files were then quality checked by FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by featureCounts 1.5.0 (Liao et al., 2014) from aligned bam files (Supplemental Fig. S1). 26 libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
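The library-level quality filter can be expressed as a simple predicate. The Python sketch below is illustrative only: the function name keep_library, the tuple layout and the example libraries are hypothetical, while the two thresholds and the library counts are those stated above.

```python
def keep_library(total_reads, alignment_rate,
                 min_reads=5_000_000, min_rate=0.70):
    """Return True if a library passes both quality thresholds.

    Libraries with fewer than `min_reads` reads or an overall alignment
    rate below `min_rate` are excluded.
    """
    return total_reads >= min_reads and alignment_rate >= min_rate

# Hypothetical libraries: (identifier, total reads, overall alignment rate).
libraries = [
    ("lib_A", 22_400_000, 0.91),
    ("lib_B", 3_100_000, 0.88),   # too few reads
    ("lib_C", 15_700_000, 0.62),  # alignment rate too low
]
kept = [name for name, reads, rate in libraries if keep_library(reads, rate)]

# Bookkeeping from the text: 1303 downloaded, 26 + 11 excluded.
remaining = 1303 - 26 - 11
```

The bookkeeping line reproduces the sample count reported above, since removing 26 and 11 libraries from 1303 leaves 1266.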

Gene Count Normalization
The expression data was normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase Per Million (RPKM) were calculated by the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(CPM + 1) or log2(RPKM + 1)). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. Variance Stabilizing Transformation (VST) was calculated by the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in additional analyses (15116 genes).
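A minimal sketch of the log-transformed CPM and RPKM calculations and of the expression filter is given below. It is written in Python for illustration and is not the edgeR implementation used in this study: in particular, the TMM scaling factors are omitted, so the raw library size is used as the denominator. The function names and example counts are hypothetical.

```python
import math

def log2_cpm(counts):
    """log2(CPM + 1) for one library, using the raw library size.

    TMM scaling is deliberately left out of this sketch.
    """
    library_size = sum(counts)
    return [math.log2(c / library_size * 1e6 + 1) for c in counts]

def log2_rpkm(counts, gene_lengths_bp):
    """log2(RPKM + 1): reads per kilobase of gene per million mapped reads."""
    library_size = sum(counts)
    return [
        math.log2(c / (library_size / 1e6) / (length / 1e3) + 1)
        for c, length in zip(counts, gene_lengths_bp)
    ]

def passes_expression_filter(cpm_values, min_cpm=2.0, min_samples=1000):
    """True if a gene exceeds `min_cpm` in more than `min_samples` samples."""
    return sum(1 for v in cpm_values if v > min_cpm) > min_samples

# One hypothetical library with three genes.
counts = [100, 900, 0]
lengths = [1000, 2000, 1500]
cpm = log2_cpm(counts)
rpkm = log2_rpkm(counts, lengths)
```

For a gene exactly one kilobase long, RPKM equals CPM, which offers a quick sanity check on the two functions.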

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated by the cor() function. Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). Gini Correlation Coefficient was calculated by the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation was computed by the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). Cosine similarity coefficient was computed by the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned a rank of one. Ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all network weights ranged from zero to one.
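To make the difference between the two most used coefficients concrete, the Python sketch below computes PCC and SCC for a pair of expression vectors, with SCC obtained as the Pearson correlation of average ranks. This is a from-scratch illustration rather than the R cor() function used in this study, and the function names and example vectors are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression of two genes across five samples; gene_b rises
# monotonically but not linearly with gene_a.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]
pcc = pearson(gene_a, gene_b)
scc = spearman(gene_a, gene_b)
```

Because SCC depends only on rank order, a monotonic but non-linear relationship such as this one yields a perfect SCC while PCC stays below one.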

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM or VST normalized expression matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET or CLR methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
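The randomization step can be sketched as a permutation of gene labels over unchanged expression profiles. The Python function below is a hypothetical illustration (the name shuffle_gene_ids, the dictionary layout and the example gene identifiers are assumptions), not the R code used in this study.

```python
import random

def shuffle_gene_ids(expression, seed=None):
    """Reassign expression profiles to randomly permuted gene IDs.

    `expression` maps gene ID -> list of expression values. The profiles
    themselves are unchanged; only their labels are permuted, which breaks
    the link between a gene's annotation and its expression pattern.
    """
    rng = random.Random(seed)
    gene_ids = list(expression)
    shuffled = gene_ids[:]
    rng.shuffle(shuffled)
    return {new_id: expression[old_id]
            for old_id, new_id in zip(gene_ids, shuffled)}

# Hypothetical three-gene expression table.
table = {
    "gene_1": [5.1, 4.8, 6.0],
    "gene_2": [0.2, 0.1, 0.4],
    "gene_3": [9.7, 9.9, 8.5],
}
randomized = shuffle_gene_ids(table, seed=1)
```

Repeating the shuffle with different seeds and re-evaluating each resulting network gives a baseline distribution of AUROC values against which the real networks can be compared.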
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered as co-expressed. Third, maize gene ontology data for AGPv3.30 was downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq confirmed targets for HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for GO datasets were calculated by the EGAD package (Ballouz et al., 2016) in R. Basically, it utilizes the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding the GO terms of a subset of genes and testing whether the hidden GO terms can be predicted from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
Definition of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO dataset.
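Given scores from a network and binary labels from a reference dataset, AUROC can be computed without choosing a threshold, as the probability that a randomly chosen positive pair outscores a randomly chosen negative pair. The Python sketch below illustrates this pairwise formulation; it is not the ROCR or EGAD implementation used in this study, it runs in quadratic time and so suits only small examples, and the function name and example values are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    `scores` are edge weights from the network; `labels` are 1 for pairs
    present in the reference dataset and 0 otherwise. Tied scores count
    as half a correct ranking.
    """
    positives = [s for s, l in zip(scores, labels) if l]
    negatives = [s for s, l in zip(scores, labels) if not l]
    correct = sum((p > q) + 0.5 * (p == q)
                  for p in positives for q in negatives)
    return correct / (len(positives) * len(negatives))

# Hypothetical edge weights and reference labels for four gene pairs.
weights = [0.9, 0.8, 0.3, 0.2]
in_reference = [1, 0, 1, 0]
value = auroc(weights, in_reference)
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to a network that scores every reference pair above every non-reference pair.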

Network Clustering and Characterization
For each network, the top 1 million edges were selected as stringent co-expression networks. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distributions were plotted by the NetworkAnalyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) by MCL v14-137 with the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes involved in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, for which a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
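The hypergeometric enrichment test for a single GO term can be sketched as an upper-tail probability. The Python code below is an illustration under stated assumptions, not the AgriGO implementation: the function name hypergeom_upper_tail and the term counts in the example are hypothetical, the background size of 15116 genes is taken from the text, and the Yekutieli correction is not included.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated with the
    term, n: genes in the query list, k: query genes annotated with the
    term.
    """
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, min(K, n) + 1))
    return favourable / total

# Hypothetical term: 150 of the 15116 background genes carry it, and
# 12 of 214 query genes carry it.
p_value = hypergeom_upper_tail(k=12, K=150, n=214, N=15116)
```

Exact integer arithmetic via math.comb avoids the rounding problems that arise when very small tail probabilities are built from floating-point factorials.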

Database Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05 and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 PPPTY gene set (ALL_1720), genes with highest AUROC values in CSC method (CSC), PCC method (PCC) and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends 627
628
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov)
from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene
Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and
GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray
studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify
RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B,
The number of samples submitted to the NCBI GEO database each year that were generated by microarray
platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number
of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
638
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC)
values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation
(VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using
VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from
comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D,
Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for
networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman
correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC),
Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative
ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E,
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks
constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC
values from comparisons with HDA101 binding targets for networks constructed using ten inference methods.
Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below
the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines
indicate the highest and lowest AUROC values.
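
As an illustration of the AUROC statistic reported throughout these panels, the sketch below computes AUROC from a list of prediction scores and binary labels using the rank-sum (Mann-Whitney U) identity. It is a generic Python example, not the evaluation pipeline used in this study; the function name `auroc` and the example scores and labels are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical example: scores for five genes, label 1 = annotated to the term.
example_scores = [0.9, 0.4, 0.8, 0.3, 0.2]
example_labels = [1, 0, 1, 0, 1]
print(round(auroc(example_scores, example_labels), 3))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to a ranking in which every annotated gene scores above every unannotated gene.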
657
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY
(B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green
boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a
standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples
normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC,
CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation
Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini
correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE;
MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
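
The scaling step described in this legend can be sketched as a z-score transformation. The Python example below is illustrative only: it assumes that scaling is applied per GO term or gene across methods and that the sample standard deviation is used, neither of which is stated in the legend. The function name `scale_to_z` and the input values are hypothetical.

```python
import statistics


def scale_to_z(values):
    """Center values on their mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    if sd == 0:
        raise ValueError("cannot scale values with zero variance")
    return [(v - mean) / sd for v in values]


# Hypothetical AUROC values for one GO term across five inference methods.
row = [0.55, 0.60, 0.62, 0.71, 0.80]
print([round(z, 2) for z in scale_to_z(row)])
```

After scaling, each row has mean 0 and standard deviation 1, so heat map colors compare methods within a GO term or gene rather than absolute AUROC levels.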
667
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average
AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm
transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different
sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted
logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the
lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET,
Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
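
The logarithmic regression in this figure was fitted with lm() in R. As a rough Python analogue, the sketch below fits average AUROC = a + b · ln(sample size) by ordinary least squares and reports R-squared; p-values are omitted. The function name `fit_log_model` and the AUROC values are hypothetical, and only the sample sizes 12, 404, and 1266 come from the legends.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical average AUROC values; sizes 24 and 100 are also made up.
sizes = [12, 24, 100, 404, 1266]
avg_auroc = [0.58, 0.60, 0.64, 0.67, 0.70]
intercept, slope, r2 = fit_log_model(sizes, avg_auroc)
print(f"intercept={intercept:.3f} slope={slope:.3f} R2={r2:.3f}")
```

A positive slope on the log scale means each doubling of sample size adds a roughly constant increment to average AUROC, which is the diminishing-returns pattern a logarithmic model describes.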
675
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC
(black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations
of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen
individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B,
Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266)
libraries were plotted against sample size. Networks with the same number of samples included are
designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation
coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
684
Fig 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"),
and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation on networks.
Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were
plotted against network types. B, PPPTY evaluation on networks. Inference methods are indicated by a single
letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold
horizontal lines indicate the median; stars indicate the mean value of each box. Outliers are plotted as grey dots.
691
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC
curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars), and
protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m),
or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate
outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey
bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s),
MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots
indicate outliers.
699
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram
showing the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared
among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS,
MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks.
106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148
shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the
214 co-expressed genes queried by 16 cell wall pathway genes.
707
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA), and
MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with
reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of
involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network
retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are
genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C,
Network retrieved from the STRING database queried by 16 cell wall pathway genes (red nodes). Cyan nodes
are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior
knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.
718
Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis.
Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes.
Software and packages for each step are in italics between the boxes. Raw data files were acquired from the
National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a
common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were
counted (Read Count) to generate an expression matrix, which was imported to the R environment for the
normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative
representation of different maize tissues in acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue
and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than
10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our
datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype.
MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries
were grouped together as Others.
733
Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the
dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by
Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million
mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line,
CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm normalization (log2). VST
values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using
the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed
expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and
plotted as a function of gene length in base pairs (bp).
743
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and
developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean
distance (Height). DAS, days after sowing; DAP, days after pollination; V1 to V18, vegetative developmental
stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated
by Pearson correlation coefficient, ranging from 0.6 to 1.0. Red color indicates higher correlation.
749
Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation
comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is
plotted in the diagonal blocks, which separate the pairwise AUROC plots from the PCC values. AUROC values
evaluated by GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding
Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for
VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal
blocks, which separate the pairwise AUROC plots from the PCC values. AUROC values evaluated by PPPTY
datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation
coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC,
Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient;
BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative
ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
762
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY set (ALL_1720) and of the genes with the
highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average
expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements
(CPM < 0) is shown as solid circles.
767
Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC
values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted
against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to
1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included
are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile
range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines.
Dashed lines are the average AUROC values from the 17 individual networks in each category. Mean values of
each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation
coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
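
The outlier rule stated in this legend (values more than 1.5 times the interquartile range beyond the quartiles) can be sketched as follows. This is a minimal Python illustration rather than the plotting code used for the figure; the function name `tukey_outliers` and the AUROC values are hypothetical, and the quartiles come from `statistics.quantiles` with the "inclusive" method, which may differ slightly from the quantile definition used by the original plotting software.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values for a single box in the plot.
aurocs = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
print(tukey_outliers(aurocs))
```

Values returned by the function would be drawn as individual points, while the whiskers extend only to the most extreme values inside the two fences.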
777
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC
curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429
genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and
with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold
horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.
784
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA),
SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all
genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red
and blue curves show the power-law fitted distributions. The R² value indicates the goodness of fit to the
power-law model.
789
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated
(SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of
edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of
nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.
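
A power-law fit of a node degree distribution, as shown by the red curves, can be sketched as a straight-line fit on log-log axes. The legend does not state how the fit was performed, so the least-squares approach below is an assumption, and the R² it returns is computed on the log-transformed values. The function name `fit_power_law` and the degree counts are hypothetical.

```python
import math


def fit_power_law(degrees, counts):
    """Fit counts ~ a * degrees**b by least squares on log-transformed values."""
    if any(d <= 0 for d in degrees) or any(c <= 0 for c in counts):
        raise ValueError("degrees and counts must be positive")
    x = [math.log(d) for d in degrees]
    y = [math.log(c) for c in counts]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return math.exp(log_a), b, r_squared


# Hypothetical degree histogram: number of genes observed at each node degree.
node_degree = [1, 2, 4, 8, 16, 32]
n_genes = [900, 410, 160, 70, 25, 11]
a, b, r2 = fit_power_law(node_degree, n_genes)
print(f"y = {a:.1f} * x^{b:.2f}, R2 = {r2:.3f}")
```

A negative exponent with R² near 1 on the log-log scale indicates that most genes have few connections while a small number of hub genes have many.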
794
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a
gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted
in colors. Genes not in modules 1 to 8 are light grey nodes.
798
799
Literature Cited 800
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra JL, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: 10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69–71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987–D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267–1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc.114.132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703–1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424–440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146–2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139–140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876–1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol Databases Curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141–D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112–1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337–342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498–2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940–3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443–458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314–362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633–51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895–905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258–264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814–818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300–323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57–63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of “guilt-by-association” within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) “Out of Pollen” Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822–2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629–645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp.15.01821
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year from 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from comparisons with GO datasets for samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
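The AUROC score used throughout these legends can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which was run in R; the function name `auroc` and the toy scores are hypothetical. It relies on the standard identity between AUROC and the rank-sum (Mann-Whitney U) statistic, with average ranks for ties.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity.

    scores: floats, higher means the pair is more strongly predicted.
    labels: 0/1 values, 1 marks a known positive (e.g. shared annotation).
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # find the run of tied scores starting at position i
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank for the tie run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, lab in pairs if lab == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, lab) in zip(ranks, pairs) if lab == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy example (hypothetical values): positives score higher than negatives.
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
tied = auroc([0.5, 0.5], [1, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives.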
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values displayed on a color scale from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
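The scaling and distance step described in the Figure 3 legend might look like the following Python sketch. It assumes that each row (one GO term or gene) is scaled across methods and that methods are then compared by the Euclidean distance between their scaled columns; the legend does not say whether scaling was per row or global. The table values and the names `zscore` and `method_distances` are hypothetical.

```python
import math
from itertools import combinations


def zscore(values):
    """Scale a list to mean 0 and standard deviation 1 (population SD)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]


def method_distances(table, methods):
    """table maps a term to its AUROC values, listed in `methods` order.

    Returns the Euclidean distance between every pair of methods,
    computed on row-scaled values (an assumption, see the text above).
    """
    terms = sorted(table)
    scaled = {term: zscore(table[term]) for term in terms}
    columns = {
        m: [scaled[term][i] for term in terms] for i, m in enumerate(methods)
    }
    return {
        (a, b): math.sqrt(
            sum((x - y) ** 2 for x, y in zip(columns[a], columns[b]))
        )
        for a, b in combinations(methods, 2)
    }


# Hypothetical AUROC values for two terms and three methods.
methods = ["PCC", "SCC", "CLR"]
table = {"term_1": [0.70, 0.70, 0.55], "term_2": [0.62, 0.62, 0.80]}
distances = method_distances(table, methods)
```

The resulting pairwise distances are what a hierarchical clustering routine would take as input; methods with identical AUROC profiles end up at distance zero.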
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-square and p-values were calculated by the lm() function in R. Panels show PCC, SCC, MRNET and CLR networks for each evaluation. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
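The logarithmic model in this legend has the form AUROC = a + b · ln(sample size). The legend states that the fit was done with lm() in R; the Python sketch below is only a stand-in showing the same ordinary least-squares fit and R-square (p-values are omitted). The function name `fit_log_model` and the data points are hypothetical; the synthetic values are generated from a known line so the fit can be checked.

```python
import math


def fit_log_model(sample_sizes, mean_auroc):
    """Least-squares fit of mean_auroc = intercept + slope * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(mean_auroc)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic data, not values from the paper: points placed exactly on
# AUROC = 0.5 + 0.03 * ln(n) for four illustrative sample sizes.
sizes = [12, 50, 200, 1266]
synthetic = [0.5 + 0.03 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, synthetic)
```

A positive slope with a high R-square would indicate that performance keeps improving with sample size, but with diminishing returns on the original scale.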
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as “_1”, “_2” and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.
[Fig 7 graphic: Venn diagrams for the MS, PA and SA networks (A, B) and AgriGO GO term enrichment maps (C, D). Panel C terms cluster around a hub of chromatin assembly, gene silencing, DNA replication and DNA repair; panel D terms cluster around cell wall biogenesis, cellulose and polysaccharide biosynthesis, and phenylpropanoid metabolism.]
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
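The overlap count reported for Figure 7A can be sketched in Python as follows. This assumes that "highest interacting genes" means the genes with the most edges (highest degree) in each network, which the legend does not state explicitly. The function names and the toy edge lists are hypothetical; only the network labels PA, SA and MS come from the legend.

```python
from collections import Counter


def top_k_genes(edges, k):
    """Return the k genes with the highest degree in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so ties break deterministically.
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return set(ranked[:k])


def shared_top_genes(networks, k):
    """Genes present in the top-k set of every network."""
    top_sets = [top_k_genes(edges, k) for edges in networks.values()]
    return set.intersection(*top_sets)


# Toy edge lists (hypothetical gene identifiers).
networks = {
    "PA": [("g1", "g2"), ("g1", "g3"), ("g1", "g4")],
    "SA": [("g1", "g2"), ("g1", "g5"), ("g2", "g5")],
    "MS": [("g1", "g3"), ("g1", "g2"), ("g4", "g5")],
}
shared = shared_top_genes(networks, k=1)
```

With k = 1000 and the three full networks, the size of the returned set would correspond to the shared-gene count shown in the center of the Venn diagram, under the degree-based assumption above.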
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE. Science (80- ) 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol. doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One. doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science (80- ) 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Of the four aggregated networks that were evaluated, the two correlation methods (PCC and SCC) had higher AUROC values than the single network from 1266 samples (Fig. 6 and Supplemental Fig. 6). However, this aggregation strategy did not result in significantly higher AUROC scores for the MRNET and CLR networks compared with single networks built from 1266 samples (two-tailed Wilcoxon rank test for GO evaluation, p-values 0.494 and 0.796). It has been reported that MI estimation accuracy is dependent on sample size (Gao et al., 2015); therefore, individual MI networks built with a small number of libraries may not demonstrate improved accuracy from aggregation. In conclusion, the PCC/SCC-built GCN performed best using a ranked aggregation strategy, and use of this strategy in combination with the other optimized parameters creates a robust GCN.
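As an illustration of the ranked aggregation idea, the following minimal sketch ranks gene pairs by absolute PCC within each experiment and then averages the ranks across experiments. It is not the pipeline used in this study: the function names, the toy expression values, and the choice of the mean as the aggregation function are all assumptions made for the example, and every experiment is assumed to contain the same genes.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def edge_ranks(expr):
    """Rank all gene pairs in one experiment by |PCC| (rank 1 = strongest).

    Tied scores keep the alphabetical pair order (Python's sort is stable).
    """
    scores = {
        (g1, g2): abs(pearson(expr[g1], expr[g2]))
        for g1, g2 in combinations(sorted(expr), 2)
    }
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}


def aggregate(experiments):
    """Average each edge's rank over experiments; lower mean rank = stronger."""
    per_exp = [edge_ranks(expr) for expr in experiments]
    mean_rank = {e: mean(r[e] for r in per_exp) for e in per_exp[0]}
    return sorted(mean_rank, key=mean_rank.get)


# Toy data (hypothetical): two experiments, three genes, four samples each.
exp1 = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
exp2 = {"A": [5, 3, 8, 1], "B": [10, 7, 15, 2], "C": [1, 2, 3, 4]}
ranking = aggregate([exp1, exp2])
print(ranking[0])  # the gene pair with the best average rank
```

Because only ranks are combined, experiments with different numbers of samples contribute on the same scale, which is one reason a rank-based aggregate can be preferable to averaging raw correlation values.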

The Performance of Protein Networks Did Not Exceed Aggregation Networks
In many cases, mRNA levels in a cell are of interest because mRNA level is thought to be related to the level and function of a protein of interest. However, many researchers have found inconsistencies between mRNA and protein levels (Baerenfaller et al., 2008; Schwanhäusser et al., 2011; Ponnala et al., 2014; Walley et al., 2016). Although relatively less protein expression data is available, this data is amenable to GCN construction and could represent a more direct reflection of interacting proteins. Using a non-modified protein expression atlas from 23 maize tissues based upon mass spectrometry data (Walley et al., 2016), four protein networks were built with PCC, SCC, MRNET, and CLR separately and then evaluated using the same PPPTY and GO datasets as previously mentioned.
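For readers less familiar with the two correlation measures, the sketch below shows how PCC and SCC differ on a single pair of expression profiles. SCC is computed here as the PCC of the rank-transformed values, with tied values given their average rank. The profiles are invented for illustration and the helper names are assumptions, not functions from the software used in this study.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation coefficient (SCC): PCC of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical profiles with a monotonic but non-linear relationship.
gene_x = [1, 2, 3, 4, 5]
gene_y = [1, 4, 9, 16, 25]
pcc = pearson(gene_x, gene_y)
scc = spearman(gene_x, gene_y)
print(round(pcc, 3), round(scc, 3))
```

SCC reaches 1 for any strictly increasing relationship, whereas PCC is reduced when the relationship is not linear.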
GCNs constructed from protein expression did not exhibit superior AUROC values to those observed for the RNA-Seq-based GCN using the aggregation strategy (Fig. 6). When evaluated with the GO and PPPTY datasets, the performance of the protein network was lower than that of the aggregated network as well as the single network from 1266 samples. To confirm this result, a two-way ANOVA was computed with pairwise comparison for the GO evaluation, which showed that the effect of network type was significant (Supplemental Table S3). A subsequent pairwise comparison using the Wilcoxon rank sum test indicated that the PCC/SCC methods were significantly better than MRNET/CLR (Supplemental Table S3), although MI methods may be superior for some types of interactions.
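The AUROC values compared above summarize how well edge scores separate known positive gene pairs from the remaining pairs. A minimal sketch of one standard way to compute AUROC, through the rank-sum (Mann-Whitney U) identity, is given below. The scores and labels are made up, and the function names are assumptions; the evaluation in this study relied on existing software rather than this code.

```python
def average_ranks(values):
    """1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def auroc(scores, labels):
    """AUROC = U / (n_pos * n_neg), where U is the Mann-Whitney statistic.

    labels: 1 for a known positive pair (e.g. supported by a reference set),
    0 otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    ranks = average_ranks(scores)
    rank_sum_pos = sum(r for r, lab in zip(ranks, labels) if lab)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical edge scores and whether each edge is in the reference set.
edge_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
in_reference = [1, 1, 0, 1, 0, 0]
print(auroc(edge_scores, in_reference))
```

In this toy case eight of the nine positive-negative pairs are ordered correctly, so the AUROC is 8/9; a value of 0.5 would correspond to random ordering.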
The raw protein expression data included 17862 genes, of which 11429 genes overlapped with our RNA-Seq-based network and were therefore used for the analysis. To demonstrate that the performance of the protein network was not biased by the selection of genes, the PCC method was applied to all 17862 genes to construct a protein network (Supplemental Fig. 7). No improvement could be detected from the protein network derived from 17862 genes, with p-values of 0.635 for the GO evaluation and 0.995 for the PPPTY evaluation from a one-sided Wilcoxon rank sum test.

PCC and SCC-built GCN Exhibit Identical Topological and Functional Properties
In addition to evaluation of network performance based upon biological characteristics, networks can be compared based upon several network characteristics, including clustering coefficient, number of nodes, network heterogeneity (Dong and Horvath, 2007), network centralization (Dong and Horvath, 2007), number of detected modules, and number of genes in the largest module. Number of nodes is a basic construct in graph theory depicting the scale of a network. Clustering coefficient and number of modules model how densely nodes are connected in networks. Heterogeneity measures the variability of node connections. Centralization indicates how likely it is that some nodes have significantly more connections than average. In this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological characteristics like protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize networks were selected for comparison of basic network characteristics based on their overall performance: the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single network from 1266 total samples (MS). The three networks were constrained to include the top one million predicted interactions, or edges.
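The characteristics listed above can be computed directly from an edge list. The sketch below uses the definitions we understand to be those of Dong and Horvath (2007) for an unweighted network: density as mean degree divided by (n - 1), heterogeneity as the coefficient of variation of node degree, and centralization as n/(n - 2) times the difference between the scaled maximum degree and the density. The function name and the toy star network are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean, pstdev


def network_stats(edges):
    """Basic topology measures for an undirected, unweighted edge list."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    n = len(adj)
    if n < 3:
        raise ValueError("centralization needs at least three nodes")
    degrees = [len(nb) for nb in adj.values()]
    density = mean(degrees) / (n - 1)

    def local_clustering(node):
        nb = adj[node]
        k = len(nb)
        if k < 2:
            return 0.0
        # count edges among the neighbours of this node
        links = sum(1 for u in nb for v in nb if u < v and v in adj[u])
        return 2 * links / (k * (k - 1))

    return {
        "nodes": n,
        "density": density,
        # coefficient of variation of the degree distribution
        "heterogeneity": pstdev(degrees) / mean(degrees),
        # 1.0 for a star network, 0.0 when all nodes have equal degree
        "centralization": n / (n - 2) * (max(degrees) / (n - 1) - density),
        "clustering": mean(local_clustering(v) for v in adj),
    }


# Hypothetical star network: one hub gene connected to four other genes.
star = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("hub", "g4")]
stats = network_stats(star)
print(stats)
```

A star network is the most centralized topology possible, which is why its centralization equals 1, while a network in which every gene has the same degree scores 0 for both centralization and heterogeneity.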
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution (Barabasi et al., 2004; Doncheva et al., 2012; Schaefer et al., 2014). For the three final maize networks constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig. 8) and the node degree distribution (Supplemental Fig. 9) fit power-law models with r-squared values over 0.7. The MS network had the highest network centralization value. The network heterogeneity value of MS was over two times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table S4), consistent with the observed highest centralization value for this network. Centralization and heterogeneity are two measures that describe the degree distribution of networks. A scale-free network with more hubs has larger values of centralization and heterogeneity, while a network with larger values of centralization and heterogeneity may either contain a larger number of hubs or have a modest number of hubs but an extremely imbalanced degree distribution. In biological networks, many observations connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng, 2003; Horvath and Dong, 2008; Iancu et al., 2012; Scott-Boyer et al., 2013), even though theoretically we cannot rule out the possibility that high values resulted from an extremely imbalanced degree distribution. For the MS network, most highly connected genes interacted with a large number of lowly connected genes; this pattern is also reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental Fig. 8). The genes with the most interactions are expected to act as key components in GCNs (Langfelder and Horvath, 2008; Allen et al., 2012) and likely represent central regulators of multi-protein biological processes (Ma et al., 2013; Du et al., 2015). The top 1000 interacting genes from all networks were analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig. 7A). 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong candidates for central biological regulators. The annotation of these genes suggests their participation in a range of basic cellular processes (Fig. 7C), including gene expression, DNA replication, translation, and gene silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular biochemistry. Ribosomal proteins were the largest component of top interacting genes (27/148), which was expected because of their cellular abundance and involvement with translation. Interestingly, nine epigenetic regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011), CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014), demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al., 2017).
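To make the power-law statement concrete, the sketch below shows one common way such a fit and its r-squared value can be obtained: a least-squares line through the node degree distribution on log-log axes. This is an assumed procedure for illustration, not necessarily the fitting routine used to produce Supplemental Figs. 8 and 9, and the degree list is synthetic.

```python
import math
from collections import Counter


def power_law_fit(degrees):
    """Fit log10(count) = intercept + slope * log10(degree) by least squares.

    Returns (slope, intercept, r_squared). A scale-free network is expected
    to give a negative slope and an r_squared close to 1.
    """
    counts = Counter(d for d in degrees if d > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct degrees")
    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c) for c in counts.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all degree counts are identical")
    return slope, intercept, 1 - ss_res / ss_tot


# Synthetic degrees: 64 nodes of degree 1, 16 of degree 2, 4 of degree 4,
# and 1 of degree 8, i.e. count proportional to degree ** -2.
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
slope, intercept, r_squared = power_law_fit(toy_degrees)
print(round(slope, 3), round(r_squared, 3))
```

Real degree distributions are noisy in the high-degree tail, so r-squared values well below 1 (such as the values over 0.7 reported here) can still be consistent with an approximately scale-free architecture.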
To reveal the underlying properties of GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS network had fewer but larger modules detected than the PA and SA networks. Consequently, most genes in the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization value for the MS network. Conversely, the PA and SA networks separated into smaller distinct modules with related gene ontology enrichment (Supplemental Table S6 and S7). The pattern displayed by the PA and SA networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, and so these methods appear to be better for module detection.
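The core of MCL is an alternation of two matrix operations on a column-stochastic matrix: expansion (matrix squaring, which spreads flow along paths) and inflation (element-wise powering followed by column re-normalization, which strengthens strong flows). The toy implementation below is a sketch of that idea only. It uses dense Python lists, an assumed inflation value of 2, and no pruning, so it is unsuitable for networks of the size analyzed here, for which established MCL software should be used.

```python
def normalize_columns(m):
    """Scale each column of a square matrix so that it sums to 1."""
    size = len(m)
    sums = [sum(m[i][j] for i in range(size)) for j in range(size)]
    return [[m[i][j] / sums[j] for j in range(size)] for i in range(size)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mcl(edges, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected edge list; returns sorted lists of node names."""
    nodes = sorted({n for e in edges for n in e})
    idx = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    m = [[0.0] * size for _ in range(size)]
    for a, b in edges:
        m[idx[a]][idx[b]] = m[idx[b]][idx[a]] = 1.0
    for i in range(size):
        m[i][i] = 1.0  # self-loops, as is customary for MCL
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                                   # expansion
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(size) for j in range(size))
        m = inflated
        if delta < tol:
            break
    # Rows with a positive diagonal act as attractors; the columns they
    # attract form one cluster (clusters may overlap in rare cases).
    clusters = {
        frozenset(nodes[j] for j in range(size) if m[i][j] > tol)
        for i in range(size) if m[i][i] > tol
    }
    return sorted(sorted(c) for c in clusters)


# Hypothetical network made of two separate triangles of genes.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("x", "y"), ("y", "z"), ("x", "z")]
modules = mcl(toy_edges)
print(modules)
```

Higher inflation values generally yield more, smaller modules, so the module counts reported for any network depend on this parameter as well as on the network itself.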
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged together, and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions (Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php

Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged network was compared to the existing knowledge of this pathway. Sixteen well-characterized components of cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in the network that met the analysis criteria. As expected, cell wall related GO terms were enriched for these 214 genes (Fig. 7D; Supplemental Table S9).
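A guide-gene query of this kind can be sketched as a filter over the merged edge list: keep every edge that touches a guide gene, then collect the genes on those edges and note any guide gene left without a partner. The gene identifiers below are hypothetical placeholders, and the sketch applies no additional analysis criteria, so it illustrates the operation rather than reproducing the exact extraction reported here.

```python
def guide_subnetwork(edges, guides):
    """Extract the edges that touch at least one guide gene.

    Returns (sub_edges, genes, orphans), where orphans are guide genes
    with no co-expressed partner in the network.
    """
    guides = set(guides)
    sub_edges = [(a, b) for a, b in edges if a in guides or b in guides]
    genes = {g for edge in sub_edges for g in edge}
    orphans = guides - genes
    return sub_edges, genes, orphans


# Hypothetical merged network and guide genes (placeholder identifiers).
merged = [
    ("guide1", "geneA"), ("guide1", "geneB"), ("guide2", "geneB"),
    ("geneA", "geneC"), ("geneC", "geneD"),
]
guide_genes = ["guide1", "guide2", "guide3"]

sub_edges, genes, orphans = guide_subnetwork(merged, guide_genes)
print(len(sub_edges), sorted(genes), sorted(orphans))
```

Edges between two non-guide genes are dropped in this sketch, so the result contains only first neighbors of the guide genes.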
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall related pathways. This suggests that the network discriminated co-expressed genes and identified some known components of the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT, CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11, IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3 (Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same 16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes (Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817 interactions with 76 genes; 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes (PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more power to find genes that function together without physically interacting. This case study shows that a robust, optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network is provided as supplemental material (Supplemental Dataset S2).
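The literature-support percentages quoted in this case study can be recomputed from the reported counts. The short Python sketch below is illustrative only: the counts are copied from the text, while the dictionary layout and the function name `support_rate` are assumptions made for this example.

```python
# Counts taken from the case study text: (genes with literature support, genes examined).
support_counts = {
    "merged GCN": (67, 214),
    "random genes": (7, 214),
    "CORNET": (40, 210),
    "STRING": (36, 75),
}


def support_rate(supported, total):
    """Fraction of retrieved genes with published cell wall evidence."""
    return supported / total


rates = {name: support_rate(*counts) for name, counts in support_counts.items()}

# Ratio of the merged GCN support rate to the random gene set support rate.
fold_over_random = rates["merged GCN"] / rates["random genes"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Because both the merged network and the random set contain 214 genes, the fold change over the random baseline reduces to 67/7.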

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With over five thousand libraries available for maize, there is now ample data to support GCN analysis. This comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq data provides a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis, consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided (Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the normalization is on a library basis, which means genes within the same library are normalized by similar factors. So when the network is constructed by PCC and BIC, where expression vectors are centered by mean or median values, the effect of different normalization methods is probably small. The two rank correlations, SCC and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same holds for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods shared similar expression distributions (Supplemental Fig. 2), MI estimations from different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall performance. This is consistent with a study in human GCN analysis (Ballouz et al., 2015), but SCC did not score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may indicate that correlation and MI inference methods capture different kinds of interactions (Meyer et al., 2008; Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference methods showed a more robust performance than any single inference method in in silico and E. coli expression networks, referring to "the wisdom of crowds". However, for analysis of the available maize data, integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq based maize GCN. This optimization may apply to a range of datasets that share characteristics with maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Process
The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples sequenced on Illumina HiSeq 2000 or HiSeq 2500 were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). Adapters were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-removed files were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated with featureCounts 1.5.0 (Liao et al., 2014) from aligned bam files (Supplemental Fig. S1). Twenty-six libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
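The library exclusion step described above can be sketched in Python as follows. Only the two thresholds (5 million reads, 70% alignment rate) come from the text; the record fields, the library identifiers, and the example values are hypothetical and are not real SRA accessions or results.

```python
from typing import NamedTuple


class Library(NamedTuple):
    run_id: str            # hypothetical identifier, not a real SRA accession
    total_reads: int       # total reads in the library
    alignment_rate: float  # overall alignment rate as a fraction (0 to 1)


MIN_READS = 5_000_000   # libraries with fewer reads are excluded
MIN_ALIGNMENT = 0.70    # libraries with a lower alignment rate are excluded


def passes_qc(lib):
    """Return True if the library meets both thresholds."""
    return lib.total_reads >= MIN_READS and lib.alignment_rate >= MIN_ALIGNMENT


def filter_libraries(libs):
    """Keep only the libraries that pass the depth and alignment filters."""
    return [lib for lib in libs if passes_qc(lib)]


# Made-up example values for illustration.
libraries = [
    Library("lib_A", 12_000_000, 0.91),
    Library("lib_B", 3_500_000, 0.88),   # too few reads
    Library("lib_C", 20_000_000, 0.55),  # alignment rate too low
]
kept = filter_libraries(libraries)
```

Applied to the counts reported in the text, removing 26 and 11 libraries from the original 1303 gives the 1266 samples used downstream.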

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM value). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) method in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in additional analysis (15116 genes).
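As a rough illustration of the CPM and RPKM calculations and the gene filter, the Python sketch below uses plain library-size scaling. It is a simplification and not the edgeR implementation: TMM scale factors are omitted, and all function names and example numbers are assumptions made for this example.

```python
import math


def cpm(counts, library_size):
    """Counts per million for one library (no TMM scaling in this sketch)."""
    return [c * 1e6 / library_size for c in counts]


def rpkm(counts, gene_lengths_bp, library_size):
    """Reads per kilobase of gene length per million reads in the library."""
    return [c * 1e9 / (library_size * length)
            for c, length in zip(counts, gene_lengths_bp)]


def log2_plus_one(values):
    """The log2(x + 1) transform applied to CPM or RPKM values."""
    return [math.log2(v + 1.0) for v in values]


def keep_gene(cpm_across_samples, min_cpm=2.0, min_samples=1000):
    """Keep a gene if it exceeds min_cpm in more than min_samples samples."""
    return sum(v > min_cpm for v in cpm_across_samples) > min_samples


# Toy library with three genes and 1000 reads in total (made-up numbers).
toy_cpm = cpm([10, 0, 990], library_size=1000)
toy_rpkm = rpkm([10], gene_lengths_bp=[2000], library_size=1000)
toy_log = log2_plus_one([3.0])
```

In the study the filter thresholds are 2 CPM and 1000 samples; the defaults of `keep_gene` mirror those values, and a smaller `min_samples` can be passed when experimenting with toy data.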

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. Kendall rank Correlation Coefficient (KCC) was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). Gini Correlation Coefficient (GCC) was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation (BIC) was computed with the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). Cosine similarity coefficient (CSC) was computed with the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned rank one. The ranks were then divided by the number of elements in the matrix and the diagonal was set to one, so that all network weights range from zero to one.
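The rank standardization described above can be sketched in Python with a Pearson correlation matrix as input. This is a minimal illustration under stated assumptions: the text does not say how ties are handled or whether absolute values are ranked, so the sketch ranks the signed values and gives tied entries their average rank. The function names and the toy expression matrix are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(expr):
    """Gene-by-gene PCC matrix; expr holds one row of values per gene."""
    n = len(expr)
    return [[1.0 if i == j else pearson(expr[i], expr[j]) for j in range(n)]
            for i in range(n)]


def rank_scale(matrix):
    """Rank all entries (smallest = 1), divide by the number of elements,
    then set the diagonal to one. Ties share their average rank (assumption)."""
    n = len(matrix)
    flat = sorted(((matrix[i][j], i, j) for i in range(n) for j in range(n)),
                  key=lambda item: item[0])
    total = len(flat)
    scaled = [[0.0] * n for _ in range(n)]
    start = 0
    while start < total:
        end = start
        while end + 1 < total and flat[end + 1][0] == flat[start][0]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for _, i, j in flat[start:end + 1]:
            scaled[i][j] = average_rank / total
        start = end + 1
    for i in range(n):
        scaled[i][i] = 1.0
    return scaled


# Toy expression matrix: three genes measured in four samples (made-up values).
toy_expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
]
weights = rank_scale(correlation_matrix(toy_expr))
```

Because every method's output is mapped onto the same zero-to-one rank scale, networks built with correlation and mutual information methods become directly comparable.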

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM or VST normalized expression matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
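The gene ID shuffling used to build random baseline networks can be sketched as below. This is an illustrative Python version, not the R code used in the study; the dictionary layout, the function name, and the gene labels are hypothetical.

```python
import random


def shuffle_gene_ids(expr_by_gene, seed):
    """Reassign gene IDs at random while leaving each expression vector intact.

    expr_by_gene maps a gene ID to its expression vector across samples.
    The seed makes a given randomization reproducible.
    """
    rng = random.Random(seed)
    gene_ids = list(expr_by_gene)
    shuffled_ids = gene_ids[:]
    rng.shuffle(shuffled_ids)
    return {new_id: expr_by_gene[old_id]
            for old_id, new_id in zip(gene_ids, shuffled_ids)}


# Made-up expression matrix with placeholder gene labels.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [4.0, 5.0, 6.0],
    "geneC": [7.0, 8.0, 9.0],
}
randomized = shuffle_gene_ids(toy_expr, seed=1)
```

Shuffling the labels keeps the co-expression structure of the matrix but breaks the link between each gene and its annotation, so the evaluation score of such a network serves as a chance-level baseline.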
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package (Ballouz et al., 2016) in R. EGAD uses the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding a subset of genes' GO terms and testing whether the hidden GO terms can be predicted from the remaining annotations. The performance of the prediction model was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
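For readers unfamiliar with AUROC, the following Python sketch computes it from edge weights and binary labels using the rank-sum (Mann-Whitney) identity. It is a generic illustration and not the ROCR or EGAD implementation; the function name and the example scores are assumptions.

```python
def auroc(scores, labels):
    """AUROC from the rank-sum identity; tied scores share their average rank.

    scores: predicted edge weights (higher means more likely co-expressed).
    labels: 1 for a true co-expressed pair in the reference set, 0 otherwise.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and scores[order[end + 1]] == scores[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for index in order[start:end + 1]:
            ranks[index] = average_rank
        start = end + 1
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, label in zip(ranks, labels) if label)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Made-up edge weights and reference labels for five gene pairs.
example_auroc = auroc([0.9, 0.8, 0.4, 0.3, 0.1], [1, 0, 1, 0, 0])
```

The value equals the probability that a randomly chosen true pair receives a higher weight than a randomly chosen false pair, so 0.5 corresponds to random ranking and 1.0 to perfect separation.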
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts that two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts that two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts that two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts that two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts that a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts that a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts that a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts that a gene does not have a specific GO term but it has that GO term in our GO dataset.
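The edge-level definitions above can be expressed as a small counting routine. The Python sketch below treats edges as undirected gene pairs; the function name and the gene labels are hypothetical and serve only to illustrate the four categories.

```python
from itertools import combinations


def confusion_counts(predicted_edges, reference_edges, genes):
    """Count TP, FP, TN, and FN over all unordered gene pairs.

    predicted_edges: pairs the network calls co-expressed.
    reference_edges: pairs co-expressed according to the reference dataset.
    genes: all genes under evaluation.
    """
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for gene_a, gene_b in combinations(sorted(genes), 2):
        pair = frozenset((gene_a, gene_b))
        in_predicted = pair in predicted
        in_reference = pair in reference
        if in_predicted and in_reference:
            counts["TP"] += 1
        elif in_predicted:
            counts["FP"] += 1
        elif in_reference:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Four placeholder genes give six possible pairs.
example_counts = confusion_counts(
    predicted_edges=[("g1", "g2"), ("g1", "g3")],
    reference_edges=[("g2", "g1"), ("g3", "g4")],
    genes=["g1", "g2", "g3", "g4"],
)
```

Sweeping the weight threshold that decides which pairs count as predicted, and recording the resulting true and false positive rates, traces out the ROC curve whose area is reported as AUROC.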

Network Clustering and Characterization
For each network, the top 1 million edges were selected as stringent co-expression networks. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distributions were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14-137 and the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
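Selecting the highest-weighted edges and tabulating node degree can be sketched in Python as follows. The study performed the topological analysis in Cytoscape, so this is only an illustration of the idea; the function names, gene labels, and weights are made up.

```python
import heapq
from collections import Counter


def top_edges(edge_weights, k):
    """Return the k highest-weighted edges as ((gene_a, gene_b), weight) tuples."""
    return heapq.nlargest(k, edge_weights.items(), key=lambda item: item[1])


def node_degrees(edges):
    """Number of retained edges touching each gene."""
    degrees = Counter()
    for (gene_a, gene_b), _weight in edges:
        degrees[gene_a] += 1
        degrees[gene_b] += 1
    return degrees


# Made-up edge weights on the zero-to-one rank scale.
toy_weights = {
    ("g1", "g2"): 0.99,
    ("g1", "g3"): 0.95,
    ("g2", "g3"): 0.40,
    ("g3", "g4"): 0.97,
}
stringent = top_edges(toy_weights, k=3)
degrees = node_degrees(stringent)
```

In the study k is one million; here k is 3 so that the lowest-weighted toy edge is dropped.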

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
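The enrichment statistics named above can be illustrated with a short Python sketch: an upper-tail hypergeometric p-value followed by a false discovery rate adjustment. It assumes that the "Yekutieli method" refers to the Benjamini-Yekutieli procedure; it is not the AgriGO implementation, and the function names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, annotated, drawn, background):
    """P(X >= k): chance of seeing at least k annotated genes in a query list.

    annotated: background genes carrying the GO term.
    drawn: size of the query gene list.
    background: total number of background genes.
    """
    upper = min(annotated, drawn)
    favourable = sum(comb(annotated, i) * comb(background - annotated, drawn - i)
                     for i in range(k, upper + 1))
    return favourable / comb(background, drawn)


def benjamini_yekutieli(pvalues):
    """Benjamini-Yekutieli adjusted p-values (FDR control under dependency)."""
    m = len(pvalues)
    harmonic = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        candidate = min(1.0, pvalues[index] * m * harmonic / rank)
        running_min = min(running_min, candidate)
        adjusted[index] = running_min
    return adjusted


# Small made-up example: 10 background genes, 4 annotated, 3 drawn, 2 hits.
toy_p = hypergeom_pvalue(k=2, annotated=4, drawn=3, background=10)
toy_adjusted = benjamini_yekutieli([0.01, 0.04, 0.03])
```

In the study the background is the 15116 genes in the networks, and a GO term is retained only when its adjusted value does not exceed 0.05.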

Database Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) through its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2 transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node degree distribution for four networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression Data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.

Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
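The logarithmic regression mentioned in the Figure 4 legend can be illustrated with an ordinary least-squares fit of average AUROC against the natural logarithm of sample size. The Python sketch below is not the R lm() call used for the figure, and the data points are synthetic values generated from a known line purely to show the calculation; they are not results from the study.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(size) for size in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(aurocs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2
                 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Synthetic points lying exactly on 0.55 + 0.02 * ln(n); not data from the paper.
sizes = [12, 24, 48, 100, 404, 1266]
values = [0.55 + 0.02 * math.log(size) for size in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
```

A positive slope in such a fit indicates that performance keeps improving as libraries are added, with diminishing returns on the original sample-size scale.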

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation on networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation on networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported to the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized, log2 transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stage. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding correlation coefficient values, calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding correlation coefficient values, calculated by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
762
Supplemental Figure 5 Characteristic of all 1720 PPPTY gene set (ALL_1720) genes with highest AUROC 763
values in CSC method (CSC) PCC method (PCC) and MRNET method (MRNET) Average expression in 764
CPM of four gene sets were in squares average number of lowly expressed elements (CPM lt 0) were in solid 765
circles 766
767
Supplemental Figure 6 Evaluation of network performance based on sample size and inference A AUROC 768
values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted 769
against sample size B AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 770
1266 (S1266) libraries were plotted against sample size Networks with the same number of samples included 771
are designated as ldquo_1rdquo ldquo_2rdquo and ldquo_3rdquo Outliers were defined as outside of 15 times the interquartile range 772
above the 75 quantile or below 25 quantile Median values were plotted as bold horizontal lines Dash lines 773
are average AUROC value from 17 individual networks of each categories Mean values of each network were 774
labeled in asterisks PCC Pearson Correlation Coefficient SCC Spearman correlation coefficient MRNET 775
Minimum Redundancy NETwork CLR Context Likelihood of Relatedness 776
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. The red and blue curves show the power-law fitted distribution. The R² value indicates the goodness of fit to the power-law model.
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection of the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
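Every panel in this figure reports AUROC. As a reminder of what that number measures, the following minimal Python sketch computes AUROC from scores and binary labels through the rank-sum (Mann-Whitney U) formula. It is not the evaluation pipeline used in the paper; the function `auroc` and the example scores are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) formula, using average ranks for ties.

    scores: predicted association strengths; labels: 1 for known positives, 0 otherwise.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hypothetical scores for six genes, three of which carry the annotation of interest.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(auroc(scores, labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to every positive being ranked above every negative.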
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values ranging from low (blue) to high (red). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
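The scaling and distance steps named in this caption can be illustrated as follows. This is a minimal Python sketch, assuming that scaling to a standard normal distribution means a z-score per GO term (row) across methods; the helpers `zscore` and `euclidean` and all AUROC numbers are hypothetical, and the hierarchical clustering itself is omitted.

```python
import math
from statistics import mean, stdev

def zscore(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical AUROC values for three GO terms under three methods.
auroc_by_method = {
    "PCC":   [0.70, 0.55, 0.80],
    "SCC":   [0.72, 0.56, 0.79],
    "MRNET": [0.60, 0.75, 0.65],
}
methods = list(auroc_by_method)
# Scale each GO term (row) across methods, then rebuild one profile per method.
rows = [zscore(row) for row in zip(*auroc_by_method.values())]
profiles = {m: [row[i] for row in rows] for i, m in enumerate(methods)}

for a, b in [("PCC", "SCC"), ("PCC", "MRNET")]:
    print(a, b, round(euclidean(profiles[a], profiles[b]), 3))
```

Methods whose scaled profiles lie close together in this distance would be grouped first by a distance-based clustering.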
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-square and p-values were calculated by the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
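The logarithmic model in this caption regresses average AUROC on the natural logarithm of sample size. The caption states that R's lm() was used; the Python sketch below is only an illustrative equivalent of the least-squares fit and its R-square (the p-value is omitted). The function `fit_log_model` and the data points are hypothetical.

```python
import math

def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size)."""
    xs = [math.log(n) for n in sample_sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(aurocs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical average AUROC values for networks built from 12 to 1266 libraries.
sizes = [12, 50, 200, 600, 1266]
avg_auroc = [0.58, 0.62, 0.66, 0.69, 0.71]
intercept, slope, r2 = fit_log_model(sizes, avg_auroc)
print(f"slope={slope:.4f}, R2={r2:.3f}")
```

A positive slope with a high R-square would indicate that performance keeps improving with more samples, but with diminishing returns on the original scale.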
Figure 4 panel titles: A, GO PCC, GO SCC, GO MRNET, GO CLR; B, PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR.
Figure 5. Evaluation of networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Figure 6 panel titles: A, Protein GO; B, Protein PPPTY. Vertical axis: AUC.
Figure 7 panel labels: A and B, Venn diagrams with sets MS, PA and SA. C, enriched GO terms for the shared hub genes (Hub), dominated by chromatin assembly and organization, DNA replication and repair, gene silencing and gene expression terms. D, enriched GO terms for the cell wall query (Cell Wall), dominated by cell wall biogenesis and organization, polysaccharide, glucan and cellulose metabolism, and phenylpropanoid biosynthesis terms.
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
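The overlap counts reported in the Figure 7 Venn diagrams can be illustrated with a small sketch. It assumes that "highest interacting" means highest node degree in an undirected edge list, which is an interpretation rather than something the caption states; the function names `top_k_nodes` and `venn3` and the toy gene identifiers are hypothetical.

```python
from collections import Counter


def top_k_nodes(edges, k):
    """Return the k nodes with the highest degree in an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by degree (descending), then by name so ties break deterministically.
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    return set(ranked[:k])


def venn3(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "ab_only": len((a & b) - c),
        "ac_only": len((a & c) - b),
        "bc_only": len((b & c) - a),
        "abc": len(a & b & c),
    }


# Toy networks standing in for PA, SA, and MS (gene names are made up).
pa = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g4", "g1")]
sa = [("g1", "g2"), ("g2", "g5"), ("g2", "g3")]
ms = [("g1", "g5"), ("g5", "g3"), ("g5", "g2")]

regions = venn3(top_k_nodes(pa, 2), top_k_nodes(sa, 2), top_k_nodes(ms, 2))
print(regions["abc"])  # 1
```

The same `venn3` helper applies to panel B if the sets hold edges instead of genes, provided each undirected edge is stored in a canonical form such as a sorted tuple.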
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R package version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol Databases Curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
In addition to evaluating network performance with biological characteristics, networks can be compared on several topological characteristics, including clustering coefficient, number of nodes, network heterogeneity (Dong and Horvath 2007), network centralization (Dong and Horvath 2007), number of detected modules, and number of genes in the largest module. The number of nodes is a basic construct in graph theory that describes the scale of a network. Clustering coefficient and number of modules describe how densely nodes are connected. Heterogeneity measures the variability of node connections. Centralization indicates how likely it is that some nodes have substantially more connections than average. In this analysis, each gene corresponds to a node. Based on the extensive evaluation using biological characteristics such as protein-protein interactions (PPPTY) and predicted gene function (GO), three final maize networks were selected for comparison of basic network characteristics based on their overall performance: the PCC- and SCC-built ranked aggregation networks from 17 experiments (PA and SA) and the MRNET-built single network from all 1266 samples (MS). The three networks were constrained to include the top one million predicted interactions, or edges.
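As an illustration of the two degree-based measures named above, the following minimal sketch computes heterogeneity and centralization from an undirected edge list. It assumes the commonly used definitions attributed to Dong and Horvath (heterogeneity as the coefficient of variation of node degree, and centralization as n/(n-2) times the difference between the scaled maximum degree and the network density); the use of population variance, the function names, and the toy graphs are assumptions of this sketch, not details taken from the analysis itself.

```python
from math import sqrt

def degrees(edges):
    """Count connections per node in an undirected edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

def heterogeneity(deg):
    """Coefficient of variation of node degree (population variance assumed)."""
    k = list(deg.values())
    mean = sum(k) / len(k)
    var = sum((x - mean) ** 2 for x in k) / len(k)
    return sqrt(var) / mean

def centralization(deg):
    """n/(n-2) * (max(k)/(n-1) - density), with density = mean(k)/(n-1)."""
    k = list(deg.values())
    n = len(k)
    density = sum(k) / (n * (n - 1))
    return n / (n - 2) * (max(k) / (n - 1) - density)

# Toy graphs with hypothetical node names: one hub versus evenly connected nodes.
star = [("hub", g) for g in ("g1", "g2", "g3", "g4")]
ring = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")]

star_deg = degrees(star)
ring_deg = degrees(ring)
```

In the star graph a single hub carries every edge, so both measures are high, whereas the ring has identical degrees and both measures are zero. Nodes without any edge do not appear in the edge list and are therefore ignored here.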
In prior studies, most biological networks had scale-free architectures, which fit a power-law distribution (Barabasi et al 2004; Doncheva et al 2012; Schaefer et al 2014). For the three final maize networks constructed using optimized parameters, both the neighborhood connectivity distribution (Supplemental Fig 8) and the node degree distribution (Supplemental Fig 9) fit power-law models with r-squared values over 0.7. The MS network had the highest network centralization value. The network heterogeneity value of MS was over two times that of PA and SA, indicating that MS may contain more highly interacting genes (Supplemental Table S4), consistent with the highest centralization value observed for this network. Centralization and heterogeneity are two ways to summarize the degree distribution of a network. A scale-free network with more hubs has larger centralization and heterogeneity values; conversely, large values may reflect either a larger number of hubs or a modest number of hubs with an extremely imbalanced degree distribution. In biological networks, many observations have connected large values of centralization and heterogeneity with more hub genes (Ma and Zeng 2003; Horvath and Dong 2008; Iancu et al 2012; Scott-Boyer et al 2013), although theoretically we cannot rule out the possibility that high values resulted from an extremely imbalanced degree distribution. For the MS network, most highly connected genes interacted with a large number of lowly connected genes; this pattern is also reflected in the decreasing neighborhood connectivity distribution for the MS network (Supplemental Fig 8). The genes with the most interactions are expected to act as key components in GCNs (Langfelder and Horvath 2008; Allen et al 2012) and likely represent central regulators of multi-protein biological processes (Ma et al 2013; Du et al 2015). The top 1000 interacting genes from all networks were analyzed in more detail, as these were potential "hub" genes that may regulate other expression patterns and processes. PA and SA shared 95% of the top 1000 interacting genes, while MS had 835 unique genes (Fig 7A). In total, 148 genes were shared among all three networks (Supplemental Table S5), making these genes strong candidates for central biological regulators. The annotation of these genes suggests their participation in a range of basic cellular processes (Fig 7C), including gene expression, DNA replication, translation, and gene silencing (Supplemental Table S5); the top interacting genes were not limited to a subset of cellular biochemistry. Ribosomal proteins were the largest component of the top interacting genes (27/148), which was expected because of their cellular abundance and involvement in translation. Interestingly, nine epigenetic regulators were found among the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al 2011), CHR106 (GRMZM2G071025) (Li et al 2014a), and LBL1 (GRMZM2G020187) (Dotto et al 2014), demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al 2017).
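The power-law fit mentioned above can be sketched as a straight-line fit on log-transformed values. The example below is a minimal illustration under the assumption that the fit is an ordinary least-squares regression of log(count) on log(degree), with r-squared computed on the log scale; whether the plotting tool used in the study fits in exactly this way is not stated in the text, and the function name and toy distribution are hypothetical.

```python
from math import exp, log

def power_law_fit(degree_counts):
    """Fit count = a * degree**b by least squares on log-log values.

    degree_counts maps a node degree to the number of nodes with that degree.
    Returns (a, b, r_squared), where r_squared is computed on the log scale.
    """
    pts = [(log(k), log(c)) for k, c in degree_counts.items() if k > 0 and c > 0]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    b = sxy / sxx
    a = exp(my - b * mx)
    r_squared = sxy ** 2 / (sxx * syy)
    return a, b, r_squared

# Toy degree distribution that follows count = 1000 * degree**-2 exactly.
toy = {k: 1000 * k ** -2.0 for k in range(1, 50)}
a, b, r_squared = power_law_fit(toy)
```

A negative exponent b with r-squared close to one indicates a distribution dominated by lowly connected nodes with a few highly connected ones. The sketch needs at least two distinct degrees and non-constant counts, otherwise the ratio is undefined.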
To reveal the underlying properties of the GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was used to identify network modules (Enright et al 2002; Morris et al 2011). The results showed a shared pattern between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS network had fewer but larger modules than the PA and SA networks. Most genes in the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with related gene ontology enrichment (Supplemental Table S6 and S7). The pattern displayed by the PA and SA networks (Supplemental Fig 10) seems more likely to represent biologically relevant pathways, so these methods appear to be better for module detection.
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA and MS were merged, and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA and SA shared 83.5% of their interactions, while 87.3% of the interactions in MS were unique (Fig 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php

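The merging step described above amounts to a set intersection over the top-ranked edges of each network. The following sketch shows one way to express it, assuming each network is available as (gene, gene, weight) triples with each gene pair listed once; the function names, gene names, and weights are hypothetical, and the cutoff k stands in for the one million edges used in the analysis.

```python
def edge_key(a, b):
    """Order-independent key so that (a, b) and (b, a) are the same edge."""
    return (a, b) if a <= b else (b, a)

def top_edges(weighted_edges, k):
    """Keep the k highest-weight edges from (gene_a, gene_b, weight) triples."""
    ranked = sorted(weighted_edges, key=lambda e: e[2], reverse=True)
    return {edge_key(a, b) for a, b, _ in ranked[:k]}

def merge_networks(*edge_sets):
    """Return the edges present in every network and the genes they connect."""
    merged = set.intersection(*edge_sets)
    genes = {g for edge in merged for g in edge}
    return merged, genes

# Toy networks standing in for PA, SA and MS.
pa = top_edges([("g1", "g2", 0.9), ("g2", "g3", 0.8), ("g1", "g4", 0.2)], k=2)
sa = top_edges([("g2", "g1", 0.7), ("g3", "g2", 0.6), ("g4", "g5", 0.1)], k=2)
ms = top_edges([("g1", "g2", 0.5), ("g4", "g5", 0.4), ("g2", "g3", 0.3)], k=2)
merged, genes = merge_networks(pa, sa, ms)
```

Only the edge supported by all three toy networks survives, which mirrors why the merged network is much smaller than any single input network.
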
Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged network was compared to existing knowledge of this pathway. Sixteen well-characterized components of cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase gene (Penning et al 2009; Bosch et al 2011). Collectively, 214 genes connected by 377 edges were extracted from the network with the 16 guide genes (Fig 8A); two guide genes did not have any co-expressed genes in the network that met the analysis criteria. As expected, cell wall related GO terms were enriched among these 214 genes (Fig 7D; Supplemental Table S9).
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize genes and their Arabidopsis homologs as queries (Supplemental Table S10). The literature survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall related pathways. This suggests that the network discriminated co-expressed genes and identified some known components of the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and strength in the secondary cell wall (reviewed by Vanholme et al 2010). Interestingly, even though no lignin biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT, CCoAOMT1 and PDR1) (reviewed by Zhong and Ye 2015) were found to be co-expressed with the guide genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11, IRX9, IRX14 and IRX10 (reviewed by Zhong and Ye 2015). Moreover, proteins participating in a well-studied physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6) and CESA3 (Cellulose Synthase 3) (Desprez et al 2007; Gu et al 2010), were also predicted to be co-expressed in the network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De Bodt et al 2012) and the STRING functional protein association network (Szklarczyk et al 2015) using the same 16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes (Fig 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very well, with 14 out of 16 genes demonstrating predicted protein association (Fig 8C), resulting in 817 interactions with 76 genes; 48% (36/75) of the co-expressed genes were experimentally confirmed (Supplemental Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes (PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more power to find genes that function together without physically interacting. This case study shows that a robust, optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network was provided as supplemental material (Supplemental Dataset S2).

Discussion
As the per-read cost of RNA-Seq technology decreases, use of this technology is increasing quickly. With over five thousand libraries available for maize, there is now ample data to support GCN analysis. This comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq data will provide a useful set of optimized parameters to support these analyses.
In our analysis, the VST, CPM and RPKM normalization methods had equivalent outcomes for GCN analysis, consistent with prior results using much smaller datasets (Giorgi et al 2013). Several benchmark studies focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided (Maza et al 2013; Dillies et al 2013b; Zyprych-Walczak et al 2015). This was not observed for the maize GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity within samples that normalization effects were minimized (Paulson et al 2016). Furthermore, normalization is applied on a library basis, which means genes within the same library are normalized by similar factors. When the network is constructed by PCC or BIC, where expression vectors are centered by mean or median values, the effect of different normalization methods is therefore probably small. The two rank correlations, SCC and KCC, consider only differences in relative rankings, on which normalization has a limited effect; the same holds for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in parmigene (Sales and Romualdi 2011). Since the three normalization methods produced similar expression distributions (Supplemental Fig 2), MI estimates from the different normalizations are expected to be similar.
When assessing inference methods, the simple and widely used correlation methods such as PCC and SCC are less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall performance. This is consistent with a study of human GCNs (Ballouz et al 2015), but SCC did not score higher than other correlation methods in the GO and PPPTY evaluations. Some genes had higher performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al 2008; Marbach et al 2012; Song et al 2012). Marbach et al (2012) stated that integration of multiple inference methods showed more robust performance than any single inference method in in silico and E. coli expression networks, referring to "the wisdom of crowds". However, for the available maize data, integration of PCC, SCC, MRNET and CLR did not result in a network that outperformed the PCC and SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets than in prokaryotic networks (Marbach et al 2012), suggesting that more work is required to determine whether integrating algorithms can improve GCNs built from eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq based maize GCN. This optimization may apply to a range of datasets that share characteristics with maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Processing
The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or HiSeq 2500, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al 2010). The downloaded files were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). Adapters were trimmed from the fastq files with Cutadapt 1.8.1 (Martin 2011). The adapter-trimmed files were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al 2015) was used for genome alignment. Gene-level raw read counts were calculated with featureCounts 1.5.0 (Liao et al 2014) from the aligned bam files (Supplemental Fig S1). 26 libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann 2012).

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase Per Million (RPKM) were calculated with the edgeR package (Robinson et al 2010) in the R environment and then log2 transformed (expression = log2(CPM + 1) or log2(RPKM + 1)). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) method in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in additional analysis (15116 genes).

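The arithmetic behind the CPM and RPKM values and the gene filter can be sketched as follows. This is a simplified illustration, not the edgeR implementation: it uses the raw library size and omits the TMM scale factors described above, and the function names and toy numbers are assumptions of the sketch.

```python
from math import log2

def cpm(counts, lib_size):
    """Counts per million for one library (no TMM scaling in this sketch)."""
    return [c * 1e6 / lib_size for c in counts]

def rpkm(counts, lengths_bp, lib_size):
    """Reads per kilobase of gene length per million reads in the library."""
    return [c * 1e9 / (length * lib_size) for c, length in zip(counts, lengths_bp)]

def log2_plus_one(values):
    """log2(x + 1) transform applied to normalized expression values."""
    return [log2(v + 1) for v in values]

def keep_gene(cpm_across_samples, min_cpm=2, min_samples=1000):
    """True if the gene exceeds min_cpm in more than min_samples samples."""
    return sum(v > min_cpm for v in cpm_across_samples) > min_samples

# Toy library of 100 reads over two genes, and one gene seen in three samples.
toy_cpm = cpm([10, 90], lib_size=100)
toy_rpkm = rpkm([500], lengths_bp=[2000], lib_size=1_000_000)
expressed = keep_gene([3.0, 5.0, 1.0], min_cpm=2, min_samples=1)
```

The defaults of keep_gene follow the thresholds stated in the text (above 2 CPM in more than 1000 samples); the toy call lowers min_samples only so that a three-sample example is meaningful.
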
Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al 2009). Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and Horvath 2008). Cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned rank one. Ranks were then divided by the number of elements in the matrix and the diagonal was set to one, so that all network weights ranged from zero to one.

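The rank-based rescaling of adjacency weights can be sketched as below. Two details are assumptions of this sketch rather than statements from the text: all matrix entries (including the diagonal) are ranked together, and tied values receive their average rank, which is the default behavior of rank() in R. The function name and toy matrix are hypothetical.

```python
def rank_normalize(matrix):
    """Replace weights by rank / number_of_elements, then set the diagonal to 1."""
    n = len(matrix)
    flat = [matrix[i][j] for i in range(n) for j in range(n)]
    order = sorted(range(len(flat)), key=lambda idx: flat[idx])
    ranks = [0.0] * len(flat)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and flat[order[end + 1]] == flat[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # 1-based, smallest value ranks first
        for t in range(pos, end + 1):
            ranks[order[t]] = average_rank
        pos = end + 1
    total = len(flat)
    out = [[ranks[i * n + j] / total for j in range(n)] for i in range(n)]
    for i in range(n):
        out[i][i] = 1.0
    return out

# Toy symmetric weight matrix for three genes.
weights = [[1.0, 0.2, 0.9],
           [0.2, 1.0, 0.5],
           [0.9, 0.5, 1.0]]
scaled = rank_normalize(weights)
```

Because only the ordering of weights is kept, networks built with different inference methods end up on the same zero-to-one scale, which is what makes them comparable and combinable.
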
Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in the CPM- or VST-normalized expression matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET or CLR methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
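The gene ID shuffling used to build random baseline networks can be illustrated with a short sketch. It assumes the expression matrix is held as a mapping from gene ID to its expression profile; the function name, the seed argument, and the toy data are hypothetical.

```python
import random

def shuffle_gene_ids(expression, seed=None):
    """Reassign expression profiles to randomly permuted gene IDs.

    expression maps gene ID -> list of expression values across samples.
    The profiles themselves are unchanged; only their labels are permuted.
    """
    rng = random.Random(seed)
    ids = list(expression)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    return {new_id: expression[old_id] for new_id, old_id in zip(shuffled, ids)}

# Toy matrix with three genes and two samples.
toy_expression = {"g1": [1, 2], "g2": [3, 4], "g3": [5, 6]}
randomized = shuffle_gene_ids(toy_expression, seed=1)
```

Since the profiles are preserved, the co-expression structure of the randomized matrix is intact, but it no longer lines up with the gene annotations used for evaluation, which is what makes it a useful baseline.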
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al 2016) were used as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package (Ballouz et al 2016) in R. It uses the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding a subset of the genes' GO terms and testing whether the hidden GO terms can be predicted from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg 1995).
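For readers unfamiliar with AUROC, the quantity can be computed directly as the probability that a randomly chosen positive pair receives a higher network weight than a randomly chosen negative pair, with ties counted as one half. The sketch below uses this pairwise definition; it is an illustration only and not the ROCR or EGAD implementation, and the function name and toy scores are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    scores: predicted edge weights; labels: truthy for known co-expressed pairs.
    Ties between a positive and a negative contribute 0.5.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auroc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0])
uninformative = auroc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The double loop is quadratic in the number of examples, so a rank-based formulation would be preferable for networks of realistic size.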
Definition of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN): For the evaluation using the PPPTY dataset, TP: a network predicts that two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP: a network predicts that two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN: a network predicts that two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN: a network predicts that two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset, TP: a network predicts that a gene has a specific GO term and it does have that GO term in our GO dataset; FP: a network predicts that a gene has a specific GO term but it does not have that GO term in our GO dataset; TN: a network predicts that a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN: a network predicts that a gene does not have a specific GO term but it has that GO term in our GO dataset.

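The four outcome classes defined above for the gene-pair evaluation can be counted with a few lines of code. This sketch assumes both the predicted network and the reference dataset are given as sets of gene pairs stored as alphabetically sorted tuples; the function name and the toy genes are hypothetical.

```python
from itertools import combinations

def confusion_counts(genes, predicted, reference):
    """Count TP, FP, TN and FN over all unordered gene pairs.

    predicted and reference are sets of (gene_a, gene_b) tuples with
    gene_a < gene_b, matching the pairs produced from sorted(genes).
    """
    tp = fp = tn = fn = 0
    for pair in combinations(sorted(genes), 2):
        in_network = pair in predicted
        in_reference = pair in reference
        if in_network and in_reference:
            tp += 1
        elif in_network:
            fp += 1
        elif in_reference:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

toy_genes = ["a", "b", "c", "d"]
toy_predicted = {("a", "b"), ("a", "c")}
toy_reference = {("a", "b"), ("c", "d")}
tp, fp, tn, fn = confusion_counts(toy_genes, toy_predicted, toy_reference)
```

Sweeping the edge-weight threshold that defines the predicted set and recording the resulting true and false positive rates traces out the ROC curve summarized by AUROC.
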
Network Clustering and Characterization
For each network, the top 1 million edges were selected to form a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al 2003). The neighborhood connectivity distribution and node degree distribution were plotted with the NetworkAnalyzer plugin (Doncheva et al 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14-137 and the inflation value set to 1.8 (Enright et al 2002). All networks were visualized in Cytoscape.

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al 2010). The 15116 genes included in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.

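The enrichment statistics described above can be sketched in two steps: an upper-tail hypergeometric p-value for each GO term, followed by the Benjamini-Yekutieli adjustment. This is a minimal illustration under the assumption that the "Yekutieli method" refers to the Benjamini-Yekutieli procedure for dependent tests; it is not the AgriGO implementation, and the function names and toy numbers are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n query genes are drawn from N background genes,
    of which K are annotated with the GO term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

def yekutieli_adjust(pvalues):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvalues)
    c_m = sum(1 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvalues[i] * m * c_m / rank)
        adjusted[i] = running
    return adjusted

# Toy example: both query genes carry a term held by 2 of 4 background genes.
p_small = hypergeom_pvalue(k=2, n=2, K=2, N=4)
adjusted = yekutieli_adjust([0.01, 0.04, 0.03])
```

In the setting described in the text, N would be the 15116 background genes and n the number of genes in the query list; terms whose adjusted value exceeds 0.05 would be discarded.
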
Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al 2009; Bosch et al 2011) (Supplemental Table S8) were chosen as query genes to search CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) through its website and the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05 and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4, with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.

Figure 2 Normalization and network inference methods effect on single network performance A Network 639
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) 640
values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation 641
(VST) Counts Per Million (CPM) or Reads per Kilobase per Million (RPKM) methods B Network performance 642
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using 643
VST CPM or RPKM methods C Network performance was evaluated by calculating AUROC values from 644
comparisons with HDA101 binding targets for samples normalized using VST CPM or RPKM methods D 645
Network performance was evaluated by calculating AUROC values from comparisons with GO dataset for 646
samples constructed using ten inference methods including Pearson Correlation Coefficient (PCC) Spearman 647
correlation coefficient (SCC) Kendall rank Correlation Coefficient (KCC) Gini correlation coefficient (GCC) 648
Biweight midcorrelation (BIC) Cosine Similarity Coefficient (CSC) Additive ARCNE (AA) multiplicative 649
ARCNE (MA) Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR) E 650
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for samples 651
constructed using ten inference methods F Network performance was evaluated by calculating AUROC 652
values from comparisons with HDA101 binding targets for samples constructed using ten inference methods 653
Outliers were defined as outside of 15 times the interquartile range above 75 quantile or below 25 quantile 654
Median values were plotted as bold horizontal lines For C to F the horizontal dashed lines indicate the highest 655
and lowest AUROC values 656
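
The AUROC values reported in this legend summarize how well a ranked list of scores separates known positives from negatives. As a rough illustration only, and not the R pipeline used for the figure, the sketch below computes an AUROC from the rank-sum (Mann-Whitney U) identity. The function name `auroc` and the toy inputs are assumptions made for this example.

```python
def auroc(scores, labels):
    """AUROC from the rank-sum identity; tied scores share an average rank.

    scores: list of numbers (higher means more likely positive).
    labels: list of 0/1 values marking true positives.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example: both positives outrank both negatives.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to a random ranking, which is why it is the usual baseline when comparing normalization or inference methods.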

Figure 3. Similarity among ten inference methods in network performance based on GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
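
The scaling and clustering distance described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `scale_row` and `euclidean` are hypothetical, the sample standard deviation is used for scaling, and clipping to the range -3 to 3 is assumed from the color scale rather than stated in the text.

```python
import math
from statistics import mean, stdev


def scale_row(values, clip=3.0):
    """Z-score a row of AUROC values, then clip to [-clip, clip] (assumed)."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation
    if sd == 0:
        return [0.0] * len(values)
    return [max(-clip, min(clip, (v - mu) / sd)) for v in values]


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy AUROC values for one GO term across three hypothetical methods.
print(scale_row([0.60, 0.70, 0.80]))
# Distance between two method profiles, as would feed a hierarchical clustering.
print(euclidean([0.0, 0.0], [3.0, 4.0]))
```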

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
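
The logarithmic model in this legend has the form AUROC = a + b * ln(sample size). The figure itself was fitted with lm() in R; the sketch below is only an ordinary least-squares equivalent in Python that returns the intercept, slope, and R-squared (it does not compute p-values). The function name `fit_log_model` and the toy data are assumptions for illustration.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(s) for s in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(aurocs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Toy data generated from an exact logarithmic trend (not values from the paper).
sizes = [12, 50, 404, 1266]
values = [0.5 + 0.02 * math.log(s) for s in sizes]
print(fit_log_model(sizes, values))
```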

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among the single network (white, "1266"), aggregated network (grey, "agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.
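
The edge overlap summarized in panel B amounts to intersecting the top-ranked edge sets of the three networks. A minimal sketch, assuming each network is available as a list of (gene_a, gene_b, weight) tuples and that edges are undirected, is shown below. The helper names `top_edges` and `shared_edges` and the gene identifiers are hypothetical.

```python
def top_edges(weighted_edges, n):
    """Return the n highest-weight edges as a set of undirected gene pairs."""
    ranked = sorted(weighted_edges, key=lambda edge: edge[2], reverse=True)[:n]
    # frozenset makes (a, b) and (b, a) count as the same edge.
    return {frozenset((a, b)) for a, b, _ in ranked}


def shared_edges(*edge_sets):
    """Edges present in every supplied edge set (the Venn diagram center)."""
    return set.intersection(*edge_sets)


# Toy networks standing in for PA, SA, and MS (not data from the paper).
pa = [("g1", "g2", 0.9), ("g2", "g3", 0.8), ("g1", "g4", 0.1)]
sa = [("g2", "g1", 0.7), ("g3", "g4", 0.6), ("g2", "g3", 0.5)]
ms = [("g1", "g2", 0.4), ("g2", "g3", 0.3), ("g4", "g5", 0.2)]

common = shared_edges(top_edges(pa, 2), top_edges(sa, 2), top_edges(ms, 2))
print(len(common))
```

In this toy case only the g1-g2 edge is in the top two edges of all three networks; the same pattern scales to the one million edge cutoff by changing `n`.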

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported to the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
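
For readers unfamiliar with the two count-based normalizations named in this legend, the sketch below applies the standard definitions of CPM and RPKM to one library, followed by a log2 transformation. It is not the authors' code: the function names are hypothetical, the library size is taken as the sum of the supplied counts, and the pseudocount of 1 added before log2 is an assumption, since the text does not state how zeros were handled. VST is omitted because it depends on a fitted dispersion model.

```python
import math


def cpm(counts):
    """Counts Per Million: scale each gene count by library size."""
    library_size = sum(counts)
    return [c / library_size * 1e6 for c in counts]


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: CPM further divided by gene length in kb."""
    library_size = sum(counts)
    return [c * 1e9 / (library_size * length)
            for c, length in zip(counts, lengths_bp)]


def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount is an assumed choice."""
    return [math.log2(v + pseudocount) for v in values]


# Toy library with three genes (not data from the paper).
counts = [100, 300, 600]
lengths = [1000, 2000, 500]
print(cpm(counts))
print(rpkm(counts, lengths))
print(log2_transform(cpm(counts)))
```

Because RPKM divides by gene length, the third toy gene (shortest, most reads) ends up with the largest value, which is the length dependence examined in panel C.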

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values calculated by Pearson correlation shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values calculated by Pearson correlation shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks in each category. Mean values for each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
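
The outlier rule stated in this legend is the usual box plot convention. A minimal sketch of it follows; the function name `boxplot_outliers` is hypothetical, and the choice of the "inclusive" quantile method (linear interpolation over the observed range) is an assumption, since the legend does not say how quartiles were estimated.

```python
from statistics import quantiles


def boxplot_outliers(values, k=1.5):
    """Values more than k * IQR above the 75% or below the 25% quantile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Toy example: one value far above the rest is flagged.
print(boxplot_outliers([1, 2, 3, 4, 5, 100]))
```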

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the fitted power-law distributions. The R² value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node degree distribution for four networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to the genes (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the fitted power-law distribution, with the fitted function and R² indicated beside it.
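
A power-law fit of a degree distribution, y = a * x^b, is often obtained by ordinary least squares on log-transformed values. The sketch below shows that approach only as an illustration; the legend does not state which fitting procedure produced the red curve, and the function name `fit_power_law` and the toy data are assumptions.

```python
import math


def fit_power_law(degrees, node_counts):
    """Fit node_count = a * degree ** b by least squares in log-log space.

    Returns (a, b, r_squared), where r_squared is computed on the log scale.
    """
    x = [math.log(d) for d in degrees]
    y = [math.log(c) for c in node_counts]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return math.exp(log_a), b, 1 - ss_res / ss_tot


# Toy degree distribution following 100 * degree ** -2 exactly.
degrees = [1, 2, 4, 8]
node_counts = [100 * d ** -2 for d in degrees]
print(fit_power_law(degrees, node_counts))
```

A negative exponent b with R² close to 1 on the log scale is the pattern usually read as scale-free-like topology.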

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O’Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D’haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock’s challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger‐Wallace E, Haug‐Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2 tpc114132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498–2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940–3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443–458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314–362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633–51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895–905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258–264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814–818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300–323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57–63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822–2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629–645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. (A) A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 on the Illumina sequencing platform (RNA-Seq). (B) The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620, identified by a text search (dashed line), compared with the number of Illumina RNA-Seq samples per year (solid line), 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. (A) Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) method. (B) AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM. (C) AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM. (D) AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). (E) AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. (F) AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
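The two quantities named in the Figure 2 legend, AUROC and the interquartile-range outlier rule, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the reference list points to R packages such as ROCR for that purpose). The function names `auroc` and `boxplot_outliers` are hypothetical, the AUROC is computed through the rank-sum identity, and the quartiles use Python's "inclusive" interpolation, which may differ slightly from the hinges used by boxplot routines in R.

```python
from statistics import quantiles


def auroc(scores, labels):
    """AUROC from the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next score equals the current one.
        while end + 1 < n and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # 1-based rank shared by the tied run
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        start = end + 1
    n_pos = sum(1 for y in labels if y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def boxplot_outliers(values, whisker=1.5):
    """Values beyond whisker * IQR above the 75% or below the 25% quantile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical example: edge scores for four gene pairs, where label 1 marks
# a pair that shares an annotation in the evaluation dataset.
example_auroc = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUROC of 0.5 corresponds to random ranking of annotated pairs and 1.0 to perfect ranking, which is why the legends compare methods on this scale.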
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values shown on a color scale from blue (low) to red (high). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
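The scaling and distance steps described in the Figure 3 legend can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names and the toy AUROC values are hypothetical, each row (one GO term or gene) is scaled with the sample standard deviation, and only the pairwise Euclidean distances between methods are computed, which is the input a hierarchical clustering routine would then use.

```python
import math
from statistics import mean, stdev


def scale_row(values):
    """Center a row to mean 0 and scale to unit sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        return [0.0] * len(values)  # a constant row carries no contrast
    return [(v - m) / s for v in values]


def method_distances(auroc_rows, methods):
    """Euclidean distance between methods (columns) over row-scaled AUROC values."""
    scaled = [scale_row(row) for row in auroc_rows]
    distances = {}
    for i, first in enumerate(methods):
        for j in range(i + 1, len(methods)):
            second = methods[j]
            distances[(first, second)] = math.sqrt(
                sum((row[i] - row[j]) ** 2 for row in scaled)
            )
    return distances


# Hypothetical AUROC values: two GO terms (rows) scored by three methods (columns).
methods = ["PCC", "SCC", "CLR"]
auroc_rows = [
    [0.6, 0.6, 0.9],
    [0.7, 0.7, 0.5],
]
distances = method_distances(auroc_rows, methods)
```

In this toy input PCC and SCC behave identically, so their distance is zero and they would be merged first by any agglomerative clustering.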
Figure 4. Effect of sample size on network performance. (A) Average area under the ROC curve (average AUROC) values from GO evaluation of 17 networks of different sizes, plotted against the natural logarithm of sample size (log(sample size)). (B) Average AUROC values from PPPTY evaluation of 17 networks of different sizes, plotted against the natural logarithm of sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
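The logarithmic model in the Figure 4 legend has the form average AUROC = a + b * ln(sample size). The legend states that the fit was obtained with lm() in R; the Python sketch below is only an equivalent closed-form ordinary least squares fit for illustration. The function name `fit_log_model` and the example values are hypothetical, and p-values are not computed here.

```python
import math


def fit_log_model(sample_sizes, average_aurocs):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(average_aurocs)
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical values only: four network sizes and made-up average AUROCs.
sizes = [12, 50, 200, 1266]
aurocs = [0.58, 0.62, 0.66, 0.71]
intercept, slope, r_squared = fit_log_model(sizes, aurocs)
```

A positive slope in such a fit means that each multiplicative increase in the number of libraries adds a roughly constant increment to the average AUROC, so gains per added sample shrink as the compendium grows.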
Figure 5. Evaluation of networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. (A) Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. (B) Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. Networks built from the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. (A) Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). (B) AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). In both panels, bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.
Fig 7 panel content: A and B, Venn diagrams for the MS, PA and SA networks. C, GO enrichment graph for the shared hub genes, with terms including chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, gene silencing and epigenetic regulation of gene expression. D, GO enrichment graph for the cell wall query, with terms including cell wall biogenesis, cell wall organization, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process and phenylpropanoid biosynthetic process.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. (A) Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. (B) Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106591 edges were shared among the three networks. (C) GO term enrichment, analyzed with AgriGO, for the 148 highly interacting genes shared among PA, SA and MS. (D) GO term enrichment, analyzed with AgriGO, for 214 co-expressed genes queried by 16 cell wall pathway genes.
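The overlap counts in a three-way Venn diagram such as Figure 7A can be derived from the top-ranked gene sets of each network. The Python sketch below shows one way to do this under stated assumptions: the names `top_k` and `venn3_counts` are hypothetical, genes are ranked by a per-gene connectivity score (how the authors ranked "highly interacting" genes is not specified in this legend), and the toy scores are made up.

```python
def top_k(scores, k):
    """Return the k genes with the highest score; ties are broken by gene name."""
    ranked = sorted(scores, key=lambda gene: (-scores[gene], gene))
    return set(ranked[:k])


def venn3_counts(ms, pa, sa):
    """Sizes of the seven regions of a three-set Venn diagram."""
    return {
        "MS only": len(ms - pa - sa),
        "PA only": len(pa - ms - sa),
        "SA only": len(sa - ms - pa),
        "MS and PA": len((ms & pa) - sa),
        "MS and SA": len((ms & sa) - pa),
        "PA and SA": len((pa & sa) - ms),
        "all three": len(ms & pa & sa),
    }


# Hypothetical connectivity scores for a handful of genes in each network.
ms_scores = {"g1": 9, "g2": 8, "g3": 7, "g4": 1}
pa_scores = {"g1": 5, "g3": 4, "g5": 3, "g2": 1}
sa_scores = {"g1": 7, "g5": 6, "g3": 2, "g6": 1}

counts = venn3_counts(
    top_k(ms_scores, 3), top_k(pa_scores, 3), top_k(sa_scores, 3)
)
```

With k set to 1000 and real connectivity scores, the "all three" entry would correspond to the shared-gene count reported in the legend; the same functions apply to edges if each edge is represented as a hashable pair of gene identifiers.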
Figure 8. Cell wall pathway subnetworks. (A) Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes. (B) Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. (C) Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17 pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science (80- ) 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2 tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science (80- ) 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170 pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
silencing (Supplemental Table S5), the top interacting genes were not limited to a subset of cellular
biochemistry. Ribosomal proteins were the largest component of top interacting genes (27/148), which was
expected because of their cellular abundance and involvement with translation. Interestingly, nine epigenetic
regulators were found in the 148 shared genes, including AGO104 (GRMZM2G141818) (Singh et al., 2011),
CHR106 (GRMZM2G071025) (Li et al., 2014a), and LBL1 (GRMZM2G020187) (Dotto et al., 2014),
demonstrating the importance of epigenetic regulation for plant development (reviewed by Huang et al.,
2017).
To reveal the underlying properties of GCNs, a graph clustering algorithm, the Markov Cluster Algorithm (MCL), was
used to identify network modules (Enright et al., 2002; Morris et al., 2011). The result showed a shared pattern
between the PA and SA networks that was distinct from the MS network (Supplemental Table S4). The MS
network had fewer but larger modules than the PA and SA networks. Consequently, most genes in
the MS network clustered into one very large module of 14054 genes, consistent with the high network centralization
value for the MS network. Conversely, the PA and SA networks separated into smaller, distinct modules with
related gene ontology enrichment (Supplemental Tables S6 and S7). The pattern displayed by the PA and SA
networks (Supplemental Fig. 10) seems more likely to represent biologically relevant pathways, so these
methods appear to be better for module detection.
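The module detection step can be illustrated with a minimal sketch of the MCL iteration (expansion by matrix squaring, then inflation by element-wise powering and column renormalization). This is a plain Python illustration rather than the clusterMaker implementation used in the analysis; the function name `mcl`, the inflation value of 2.0, and the added self-loops are assumptions made for the example.

```python
def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square list-of-lists of weights."""
    n = len(adjacency)
    # Add self-loops (an assumed, commonly used convention that stabilises the iteration).
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]

    def normalize(mat):
        # Rescale every column to sum to one (column-stochastic matrix).
        sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        return [[mat[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
                for i in range(n)]

    m = normalize(m)
    for _ in range(max_iter):
        # Expansion: squaring the matrix spreads flow along longer paths.
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: element-wise power favours strong flows over weak ones.
        inflated = normalize([[v ** inflation for v in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Rows that retain flow are attractors; the columns they reach form a module.
    modules = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    modules.discard(frozenset())
    return sorted(sorted(mod) for mod in modules)
```

A higher inflation value generally yields more and smaller modules, so the module sizes reported for the PA, SA, and MS networks depend on this setting as well as on network topology.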
To compile a high-confidence co-expression network, the top 1 million edges from PA, SA, and MS were merged,
and the intersection of the three produced a merged network of 14277 genes and 106591 interactions. PA
and SA shared 83.5% of common interactions within the networks, while MS had 87.3% unique interactions
(Fig. 7B). This merged network (Supplemental Dataset S1) was used for a case study analysis of cell wall
biosynthesis. The same network can also be accessed at http://www.bio.fsu.edu/mcginnislab/mcn/main_page.php.

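The merging step amounts to keeping only the edges that appear among the top-ranked edges of every network. The following is a minimal sketch under the assumption that each network is stored as a dictionary mapping a gene pair to its weight; the names `top_edges` and `merge_networks` are hypothetical and not taken from the original pipeline.

```python
def top_edges(weights, k):
    """Return the k highest-weighted gene pairs as a set of unordered pairs."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:k]}


def merge_networks(networks, k):
    """Intersect the top-k edge sets of several networks."""
    edge_sets = [top_edges(weights, k) for weights in networks]
    shared = set.intersection(*edge_sets)
    # Genes of the merged network are the endpoints of the shared edges.
    genes = set().union(*shared) if shared else set()
    return shared, genes
```

For the networks described here, `k` would be one million, and `networks` would hold the PA, SA, and MS edge weights.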
Case Study: Cell Wall Biosynthesis and Regulation
To demonstrate the functionality of the network, the predicted cell wall biosynthesis pathway from the merged
network was compared to the existing knowledge of this pathway. Sixteen well-characterized components of
cell wall biosynthesis were selected as guide genes (Supplemental Table S8), including five cellulose
synthase genes, seven cellulose synthase-like genes, three glycosyl hydrolase genes, and one glycosidase
gene (Penning et al., 2009; Bosch et al., 2011). Collectively, 214 genes connected by 377 edges were extracted
from the network with the 16 guide genes (Fig. 8A); two guide genes did not have any co-expressed genes in
the network that met the analysis criteria. As expected for these 214 genes, cell wall related GO terms were
enriched (Fig. 7D; Supplemental Table S9).
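Extracting a guide-gene subnetwork can be sketched as selecting every edge that touches at least one guide gene. The selection rule below is an assumption for illustration (the exact analysis criteria are described in the Methods), and the function name `guide_subnetwork` is hypothetical.

```python
def guide_subnetwork(edges, guide_genes):
    """Keep edges with at least one guide-gene endpoint.

    Returns the genes in the subnetwork, the kept edges, and the guide
    genes that have no co-expressed partner in the network.
    """
    guides = set(guide_genes)
    kept = {frozenset(edge) for edge in edges if guides & set(edge)}
    genes = set().union(*kept) if kept else set()
    orphan_guides = guides - genes
    return genes, kept, orphan_guides
```

Guide genes returned in `orphan_guides` correspond to cases like the two guide genes reported above that had no co-expressed genes meeting the criteria.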
The resulting 214 co-expressed genes were queried against the Arabidopsis TAIR 10 protein database using
BLASTP to retrieve homologs and their annotations. The literature was manually searched using the maize
genes and their Arabidopsis homologs as queries (Supplemental Table S10). The results of the literature
survey showed that 31.3% (67/214) of the genes co-expressed with the guide genes had peer-reviewed
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817
interactions with 76 genes. 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network was
provided as supplemental material (Supplemental Dataset S2).

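The literature-support percentages quoted in this case study follow directly from the reported counts. The short sketch below recomputes them; the counts are those stated in the text, and the helper name `support_rate` is hypothetical.

```python
def support_rate(supported, total):
    """Percentage of retrieved genes with literature support."""
    return 100 * supported / total


# (supported genes, retrieved genes) as reported in the text
reported = {
    "merged GCN": (67, 214),
    "random gene set": (7, 214),
    "CORNET": (40, 210),
    "STRING": (36, 75),
}
rates = {name: round(support_rate(*counts), 2) for name, counts in reported.items()}
```

With these counts, the merged GCN recovers literature-supported genes at roughly ten times the rate of the random gene set.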
Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq
data will provide a useful set of optimized parameters to support these analyses.
In our analysis, VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is on a library basis, which means genes within the same library are normalized by similar factors.
So when the network is constructed by PCC and BIC, where expression vectors are centered by mean or
median values, the effects of different normalization methods are probably small. Two rank correlations, SCC
and KCC, only consider differences in relative rankings, on which normalization has a limited effect. The same
holds for the GCC method. The estimation of mutual information is based on the k-nearest neighbor method
implemented in parmigene (Sales and Romualdi, 2011). Since the three normalization methods shared similar
expression distributions (Supplemental Fig. 2), MI estimations from different normalizations are expected to be similar.
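One part of this argument can be checked directly: a rank correlation depends only on the ordering of expression values, so any strictly increasing transform (such as log2(x + 1)) leaves SCC unchanged, whereas PCC generally changes. The sketch below uses invented expression values for two hypothetical genes; it does not model per-library scaling factors, which are not a monotone transform of a gene's vector in general.

```python
import math


def ranks(values):
    """Average ranks (1-based), so tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented expression values across five samples (illustration only).
gene_a = [5.0, 120.0, 33.0, 900.0, 12.0]
gene_b = [2.0, 80.0, 40.0, 300.0, 60.0]
log_a = [math.log2(v + 1) for v in gene_a]
log_b = [math.log2(v + 1) for v in gene_b]
```

Here `spearman(gene_a, gene_b)` and `spearman(log_a, log_b)` are identical because the log transform preserves the ordering of each vector.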
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study in human GCN analysis (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods detect different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed more robust performance than any single inference method in in silico and E. coli
expression networks, referring to "the wisdom of crowds". However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an
RNA-Seq based maize GCN. This optimization may apply to a range of datasets that share characteristics with
maize, including a large and heterogeneous genome with rich and diverse transposable element composition
and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Process
The maize genome and its annotation were downloaded from Ensembl Plants Release 31
(http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or HiSeq 2500,
were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files
were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). Adapters were
trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files were then quality
checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim
et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by
featureCounts 1.5.0 (Liao et al., 2014) from aligned bam files (Supplemental Fig. S1). Twenty-six libraries with fewer
than 5 million total reads and 11 libraries with less than 70% total alignment rate were excluded, leaving
1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was
streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).

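The library exclusion criteria can be expressed as a simple filter. This is a minimal sketch, assuming per-library total read counts and overall alignment rates have already been collected (for example from the aligner logs); the function name `passes_qc` and the library identifiers are hypothetical.

```python
def passes_qc(total_reads, alignment_rate, min_reads=5_000_000, min_rate=0.70):
    """Keep a library only if it has enough reads and a sufficient alignment rate."""
    return total_reads >= min_reads and alignment_rate >= min_rate


# Hypothetical libraries: (total reads, overall alignment rate)
libraries = {
    "SRR_A": (12_000_000, 0.91),
    "SRR_B": (3_500_000, 0.88),   # too few reads
    "SRR_C": (20_000_000, 0.55),  # alignment rate too low
}
kept = sorted(name for name, (reads, rate) in libraries.items() if passes_qc(reads, rate))

# Sample counts reported in the text: 1303 downloaded, 26 and 11 excluded.
remaining = 1303 - 26 - 11
```

The arithmetic on the last line matches the 1266 samples retained for the final expression table.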
Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per
Million (CPM) and Reads Per Kilobase Per Million (RPKM) were calculated with the edgeR package (Robinson et al.,
2010) in the R environment and then log2 transformed (expression = log2(CPM or RPKM + 1)). For both methods, scale
factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. Variance Stabilizing
Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression
higher than 2 CPM in more than 1000 samples were included in further analysis (15116 genes).

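The CPM and RPKM transformations and the expression filter can be sketched as follows. This illustration uses raw library sizes and omits the TMM scaling factors that edgeR applies (an assumption made to keep the example short; effective library sizes could be passed in instead). The function names are hypothetical.

```python
import math


def log_cpm(counts, lib_sizes):
    """log2(CPM + 1) for a genes x samples count matrix."""
    return [[math.log2(c * 1e6 / n + 1) for c, n in zip(row, lib_sizes)]
            for row in counts]


def log_rpkm(counts, lib_sizes, gene_lengths):
    """log2(RPKM + 1); gene_lengths are in base pairs."""
    return [[math.log2(c * 1e9 / (n * length) + 1) for c, n in zip(row, lib_sizes)]
            for row, length in zip(counts, gene_lengths)]


def expressed_genes(counts, lib_sizes, min_cpm=2.0, min_samples=1000):
    """Indices of genes with CPM above min_cpm in more than min_samples samples."""
    keep = []
    for g, row in enumerate(counts):
        n_pass = sum(1 for c, n in zip(row, lib_sizes) if c * 1e6 / n > min_cpm)
        if n_pass > min_samples:
            keep.append(g)
    return keep
```

With the defaults, `expressed_genes` mirrors the filter described above (more than 2 CPM in more than 1000 samples); smaller thresholds are convenient for toy data.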
Network Inference
Six correlation coefficient methods and four mutual information methods were applied to normalized gene
expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson
Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function.
Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al.,
2009). Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and
Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and
Horvath, 2008). Cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt,
2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011).
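As an illustration of the simplest of these measures, a gene-by-gene PCC matrix can be computed by centering and scaling each gene's expression vector and taking dot products. This is a plain Python sketch for small inputs, not the R cor() call used in the analysis; the name `pearson_matrix` is hypothetical.

```python
import math


def pearson_matrix(expr):
    """Pairwise Pearson correlation between rows (genes) of a genes x samples matrix."""
    centered = []
    for row in expr:
        mean = sum(row) / len(row)
        dev = [v - mean for v in row]
        norm = math.sqrt(sum(d * d for d in dev))
        # Genes with constant expression get a zero vector to avoid division by zero.
        centered.append([d / norm for d in dev] if norm else [0.0] * len(row))
    n = len(centered)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) for j in range(n)]
            for i in range(n)]
```

The resulting symmetric matrix plays the role of the adjacency matrix whose weights are ranked in the next step.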
The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value
assigned a rank of one. Ranks were then divided by the number of elements in the matrix, and the diagonal was
set to one so that all network weights ranged from zero to one.

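The rank standardization can be sketched as below. Ties are given average ranks here, which mirrors the default behaviour of R's rank() function; whether the original analysis handled ties this way is an assumption, and the name `rank_normalize` is hypothetical.

```python
def rank_normalize(adj):
    """Replace weights by rank / number of elements and set the diagonal to one."""
    n = len(adj)
    flat = [adj[i][j] for i in range(n) for j in range(n)]
    order = sorted(range(len(flat)), key=lambda idx: flat[idx])
    ranks = [0.0] * len(flat)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and flat[order[end + 1]] == flat[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            # Average rank for tied values; the smallest value gets rank 1.
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    scaled = [[ranks[i * n + j] / len(flat) for j in range(n)] for i in range(n)]
    for i in range(n):
        scaled[i][i] = 1.0
    return scaled
```

Because every method's weights end up on the same zero-to-one scale, networks built with different inference methods can be compared or aggregated directly.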
Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in CPM or VST normalized expression
matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET, or CLR
methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For
MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
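The randomization can be sketched as a permutation of gene labels over unchanged expression rows, which breaks the link between a gene's identity and its expression profile. This is a minimal illustration, assuming the expression matrix is held as a dictionary from gene ID to expression vector; the function name and the `seed` parameter are hypothetical.

```python
import random


def randomize_expression(expr_by_gene, seed=None):
    """Reassign gene IDs to expression rows at random, keeping the rows themselves intact."""
    rng = random.Random(seed)
    gene_ids = list(expr_by_gene)
    shuffled = gene_ids[:]
    rng.shuffle(shuffled)
    return {new_id: expr_by_gene[old_id] for old_id, new_id in zip(gene_ids, shuffled)}
```

Repeating the shuffle with different seeds, then inferring and evaluating a network each time, gives a null distribution of evaluation scores against which the real networks can be compared.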
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their
results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2
(Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology
data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for
evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
The widely-used Area under Receiver Operating Characteristic (AUROC) for binary classification problems 536
was used for evaluations Protein-protein interaction and pathway information was parsed into lists of co-537
expressed genes Prediction() and performance() function in R package ROCR were used to calculate 538
AUROCs (Sing et al 2005) The 277 AUROC values for GO datasets were calculated by EGAD package 539
(Ballouz et al 2016) in R Basically it utilizes the ldquoguilt-by associationrdquo principle that genes with shared GO 540
terms are more likely to connected Thus networks normalized and inferred by different methods can be 541
evaluated by hiding a subset of genes GO terms and test whether the hidden GO terms could be predicted 542
from the remaining annotations The prediction model performance was measured by AUROC values in three-543
fold cross-validation All ANOVA and pairwise Wilcoxon rank tests were analyzed in R using anova() and 544
pairwisewilcoxtest() function from stats package P-value adjustment method was set to ldquofdrrdquo (Benjamini and 545
Hochberg 1995) 546
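For readers unfamiliar with the metric, the AUROC can be computed directly from its rank interpretation: the probability that a randomly chosen positive pair receives a higher network weight than a randomly chosen negative pair. The sketch below is a plain Python illustration of that definition, not the ROCR or EGAD implementation; the function name auroc and the toy scores are assumptions.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise loop is quadratic and intended for illustration only.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical edge weights and whether each gene pair is a known positive.
edge_weights = [0.9, 0.8, 0.4, 0.3]
is_positive = [1, 0, 1, 0]
score = auroc(edge_weights, is_positive)
```

A value of 0.5 corresponds to random ranking, which is why the shuffled networks described earlier serve as a baseline.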
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts that two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts that two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts that two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts that two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts that a gene has a specific GO term and the gene has that GO term in our GO dataset; FP, a network predicts that a gene has a specific GO term but the gene does not have that GO term in our GO dataset; TN, a network predicts that a gene does not have a specific GO term and the gene does not have it in our GO dataset; FN, a network predicts that a gene does not have a specific GO term but the gene has that GO term in our GO dataset.
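The pairwise definitions above can be made concrete with a small counting example. This sketch assumes a network has already been thresholded into a set of predicted edges; the function name confusion_counts and the toy gene IDs are hypothetical, and the AUROC itself summarizes such counts over all possible thresholds rather than a single one.

```python
from itertools import combinations


def confusion_counts(predicted_edges, reference_edges, genes):
    """Count TP, FP, TN and FN over every unordered pair of genes."""
    # frozenset makes (a, b) and (b, a) the same undirected edge.
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}

    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for pair in combinations(sorted(genes), 2):
        key = frozenset(pair)
        in_predicted = key in predicted
        in_reference = key in reference
        if in_predicted and in_reference:
            counts["TP"] += 1
        elif in_predicted:
            counts["FP"] += 1
        elif in_reference:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


genes = ["g1", "g2", "g3", "g4"]
predicted = [("g1", "g2"), ("g1", "g3")]   # edges called by the network
reference = [("g2", "g1"), ("g3", "g4")]   # pairs co-expressed in the reference set
counts = confusion_counts(predicted, reference, genes)
```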

Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distributions were plotted with the Network Analyzer plugin (Doncheva et al., 2012). Graph clustering was performed with the Markov Cluster Algorithm (MCL) using MCL v14-137 with the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
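Selecting the strongest edges from a weighted network can be sketched as below. The function name top_edges is an assumption, and the toy example keeps two edges where the study kept one million; each undirected edge is read once from the upper triangle of the weight matrix.

```python
import heapq


def top_edges(weights, genes, k):
    """Return the k highest-weight undirected edges as (weight, gene_a, gene_b).

    Only the upper triangle is scanned, so self-edges on the diagonal and
    mirrored duplicates are excluded. Results are sorted by descending weight.
    """
    n = len(genes)
    candidates = (
        (weights[i][j], genes[i], genes[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return heapq.nlargest(k, candidates)


# Toy rank-normalized weights for three genes (hypothetical values).
weights = [[1.0, 0.2, 0.8],
           [0.2, 1.0, 0.5],
           [0.8, 0.5, 1.0]]
strongest = top_edges(weights, ["a", "b", "c"], k=2)
```

The resulting edge list is the kind of table that can be written out for import into Cytoscape or passed to a clustering tool such as MCL.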

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate P-values, and values below 0.05 were considered significant. The Yekutieli method was used for multiple-test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
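The hypergeometric test behind this kind of enrichment analysis can be written out in a few lines. The sketch below is an illustration of the test itself, not AgriGO's code; the function name hypergeom_pvalue and its argument names are assumptions, and the multiple-test correction step is not shown.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric P-value, P(X >= k).

    k: genes in the query list annotated with the GO term
    n: size of the query list
    K: genes in the background annotated with the GO term
    N: size of the background (15116 genes in the analysis above)
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of observing k, k+1, ..., upper annotated genes.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Toy example: 2 of 3 query genes carry a term held by 4 of 10 background genes.
p_value = hypergeom_pvalue(k=2, n=3, K=4, N=10)
```

Exact integer arithmetic is used before the final division, which keeps the calculation stable for backgrounds of this size.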

Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011; Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0/) on its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4, with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of RNA-Seq Illumina samples (solid line) per year for 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single-network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.

Figure 3. Similarity among ten inference methods in network performance based on GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and P-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single networks (white, "1266"), aggregated networks (grey, "agg"), and protein networks (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and protein networks (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and protein networks (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage
of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation from the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stage. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the pairwise AUROC plots and the PCC values. AUROC values evaluated with the GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the pairwise AUROC plots and the PCC values. AUROC values evaluated with the PPPTY datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, Area Under the ROC curve (AUROC) values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R² value indicates the fit to the power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures 923 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 27
Effects on reverse engineering gene networks Bioinformatics pp 282ndash288 924
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177–188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580–585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192–203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423–1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796–804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science (80- ) 792–801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745–64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69–71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987–D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267–1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc.114.132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703–1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424–440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146–2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139–140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876–1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141–D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112–1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337–342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498–2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940–3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443–458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314–362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633–51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895–905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258–264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814–818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300–323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57–63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822–2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629–645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp.15.01821
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of Illumina RNA-Seq samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
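The two quantities this legend relies on, an AUROC value and the 1.5 x interquartile-range outlier rule, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the evaluation pipeline used for the figure: the function names, the toy scores and labels, and the choice of the "inclusive" quartile method are all hypothetical.

```python
from statistics import quantiles


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def boxplot_outliers(values, k=1.5):
    """Values more than k * IQR above the upper quartile or below the lower quartile.

    The quartile interpolation method ("inclusive") is an assumption made for this sketch.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical data: four edge scores and whether each gene pair is a known positive.
print(auroc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
# Hypothetical per-term AUROC values, one of which sits far above the rest.
print(boxplot_outliers([0.50, 0.52, 0.55, 0.56, 0.58, 0.60, 0.95]))
```

The pairwise AUROC loop is quadratic in the number of scored pairs, which is fine for illustration; a rank-based formulation would be the usual choice for genome-scale networks.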
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are shown on a blue-to-red color scale. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
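The scaling and distance steps named in this legend can be illustrated with a short Python sketch. It assumes that scaling is applied per GO term (per row) and that methods are compared column-wise by Euclidean distance; the function names and the small AUROC table are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def zscore_rows(matrix):
    """Scale each row (one GO term or gene) to mean 0 and sample standard deviation 1."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), stdev(row)
        scaled.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return scaled


def method_distances(matrix, names):
    """Euclidean distance between every pair of columns (one column per inference method)."""
    columns = list(zip(*matrix))
    distances = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            distances[(names[i], names[j])] = math.dist(columns[i], columns[j])
    return distances


# Hypothetical AUROC table: rows are GO terms, columns are inference methods.
methods = ["PCC", "SCC", "MRNET", "CLR"]
auroc_table = [
    [0.60, 0.62, 0.70, 0.71],
    [0.55, 0.56, 0.65, 0.66],
    [0.70, 0.69, 0.58, 0.57],
]

distances = method_distances(zscore_rows(auroc_table), methods)
closest = min(distances, key=distances.get)
print(closest, round(distances[closest], 3))
```

A hierarchical clustering routine would then merge the closest methods first; only the distance matrix that such a routine consumes is shown here.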
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
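The logarithmic model described in this legend has the form AUROC = a + b ln(sample size). The legend states that the fit was done with lm() in R; the sketch below is only a Python analogue using ordinary least squares on log-transformed sample sizes, with a hypothetical function name and made-up data. It reports R-squared but not the p-value.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size); returns (intercept, slope, r2)."""
    xs = [math.log(n) for n in sample_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(aurocs) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    r2 = 1.0 - ss_res / ss_tot
    return intercept, slope, r2


# Hypothetical average AUROC values for networks built from 12 to 1266 libraries.
sizes = [12, 50, 200, 600, 1266]
average_auroc = [0.58, 0.62, 0.66, 0.69, 0.71]
intercept, slope, r2 = fit_log_model(sizes, average_auroc)
print(f"intercept={intercept:.3f} slope={slope:.3f} R2={r2:.3f}")
```

A positive slope in this model means each multiplicative increase in sample size adds a roughly constant amount to average AUROC, which is the diminishing-returns pattern a log fit is meant to capture.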
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Fig 7 panel labels: Venn diagrams of the MS, PA and SA networks. The GO term graph labeled Hub includes chromatin assembly, nucleosome organization, gene silencing, DNA replication, DNA repair and translation. The GO term graph labeled Cell Wall includes cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, phenylpropanoid biosynthetic process and root morphogenesis.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
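The overlap counted in the Venn diagrams can be sketched as a set intersection over per-network top-k gene lists. The legend does not say how "highest interacting" was scored, so this sketch assumes genes are ranked by the number of edges they take part in; the function names, the tie-breaking rule and the toy edge lists are hypothetical.

```python
from collections import Counter


def top_k_genes(edges, k):
    """Rank genes by degree (number of edges) and return the k best; ties are broken by name."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:k])


def shared_top_genes(networks, k):
    """Genes that appear in the top-k list of every network."""
    top_sets = [top_k_genes(edges, k) for edges in networks.values()]
    return set.intersection(*top_sets)


# Hypothetical miniature networks standing in for PA, SA and MS.
networks = {
    "PA": [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")],
    "SA": [("g1", "g5"), ("g1", "g2"), ("g5", "g6"), ("g5", "g2")],
    "MS": [("g1", "g7"), ("g1", "g8"), ("g7", "g8"), ("g1", "g2")],
}
print(shared_top_genes(networks, k=2))
```

With real networks, k would be 1000 for genes, and the same intersection applied to sets of edge tuples gives the shared-edge count.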
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with reported functions in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc.114.132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
publications indicating a role in cell wall synthesis or related pathways in plants. A search using 214 randomly
selected genes as queries returned only 3.27% of genes (7/214) that were involved in cell wall related pathways.
This suggests that the network discriminated co-expressed genes and identified some known components of
the pathway. Lignin biosynthesis genes are expected to function in cell wall biosynthesis to provide rigidity and
strength in the secondary cell wall (reviewed by Vanholme et al., 2010). Interestingly, even though no lignin
biosynthesis genes were included in our queries, six lignin biosynthesis genes (PAL1, C4H, 4CL2, HCT,
CCoAOMT1, and PDR1) (reviewed by Zhong and Ye, 2015) were found to be co-expressed with the guide
genes. At least nine cellulose biosynthesis and assembly genes were discovered, including CESA1, FLA11,
IRX9, IRX14, and IRX10 (reviewed by Zhong and Ye, 2015). Moreover, proteins participating in a well-studied
physical interaction, CSI1 (Cellulose Synthase Interactive 1), CESA6 (Cellulose Synthase 6), and CESA3
(Cellulose Synthase 3) (Desprez et al., 2007; Gu et al., 2010), were also predicted to be co-expressed in the
network. There were 131 genes without reported functions in cell wall pathways, an indication that GCN
analysis can be used to predict undiscovered components of biological pathways in maize.
The cell wall biosynthesis pathway results were also compared with the CORNET co-expression database (De
Bodt et al., 2012) and the STRING functional protein association network (Szklarczyk et al., 2015) using the same
16 genes and similar parameters (see Methods). From CORNET, 10 out of 16 genes had co-expressed genes
(Fig. 8B). In total, 210 genes and 325 interactions were retrieved using CORNET, of which 19% (40/210) had
publications supporting their function in cell wall pathways (Supplemental Table S11). STRING performed very
well, with 14 out of 16 genes demonstrating predicted protein association (Fig. 8C), resulting in 817
interactions with 76 genes; 48% (36/75) of co-expressed genes were experimentally confirmed (Supplemental
Table S12), the highest percentage among the three methods. Only one of the lignin biosynthesis genes
(PAL1) was found using CORNET, and none were found using STRING. Although STRING appears very
robust for predicting protein-protein interactions, this suggests that an optimized GCN analysis has more
power to find genes that function together without physically interacting. This case study shows that a robust,
optimized GCN can discover physical and functional interactions and enhance the study of biologically relevant
interactions. A tutorial on how to use Cytoscape to visualize any co-expressed genes in our network is
provided as supplemental material (Supplemental Dataset S2).
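The supported fractions quoted in this case study can be rechecked with a few lines of arithmetic. The following is only a sketch: the counts (7/214, 40/210, 36/75) are read from the text above, while the dictionary keys and the helper name `percent` are illustrative labels, not names used by the authors.

```python
# Hit counts and totals as quoted in the text; the keys are illustrative labels.
supported = {
    "random query set": (7, 214),
    "CORNET": (40, 210),
    "STRING": (36, 75),
}

def percent(hits, total):
    """Return hits/total as a percentage rounded to two decimals."""
    return round(100.0 * hits / total, 2)

percentages = {name: percent(h, t) for name, (h, t) in supported.items()}
for name, value in percentages.items():
    print(f"{name}: {value}%")
```

The rounded values agree with the percentages given in the text (about 3.27%, 19%, and 48%).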

Discussion
As the per-read cost of RNA-Seq technology decreases, the use of this technology is quickly increasing. With
over five thousand libraries available for maize, there is now ample data to support GCN analysis. This
comprehensive evaluation of normalization methods and network inference methods using real maize RNA-Seq
data will provide a useful set of optimized parameters to support these analyses.
In our analysis, VST, CPM, and RPKM normalization methods had equivalent outcomes for GCN analysis,
consistent with prior results using much smaller datasets (Giorgi et al., 2013). Several benchmark studies
focusing on differential expression (DE) analysis proposed that RPKM performed poorly and should be avoided
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed for the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
within samples that normalization effects were minimized (Paulson et al., 2016). Furthermore, the
normalization is on a library basis, which means genes within the same library are normalized by similar factors.
So when the network is constructed by PCC and BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. Two rank correlations, SCC
and KCC, only consider differences in relative rankings, where normalization has a limited effect. It is similar for
the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in
parmigene (Sales and Romualdi, 2011). Since the three normalization methods shared similar expression
distributions (Supplemental Fig. 2), MI estimations from different normalizations are expected to be similar.
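One part of this argument can be checked directly. RPKM is CPM divided by gene length in kilobases, so for any single gene the RPKM profile across samples is the CPM profile multiplied by a positive constant. Pearson and Spearman correlations between two genes are therefore identical under CPM and RPKM, as long as no further transform is applied. The sketch below illustrates this on simulated counts; it is not the authors' pipeline, it does not cover VST or log-transformed values, and all function names, gene lengths, and counts are made up for illustration.

```python
import random

def cpm(counts):
    """counts[g][s] -> counts per million; each library (column) is scaled by its total."""
    n_samples = len(counts[0])
    totals = [sum(row[s] for row in counts) for s in range(n_samples)]
    return [[row[s] * 1e6 / totals[s] for s in range(n_samples)] for row in counts]

def rpkm(counts, lengths_bp):
    """CPM divided by gene length in kilobases (one constant per gene)."""
    return [[v / (length / 1000.0) for v in row]
            for row, length in zip(cpm(counts), lengths_bp)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(x):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Simulated raw counts (genes x samples) and gene lengths; purely illustrative.
random.seed(0)
n_genes, n_samples = 5, 12
counts = [[random.randint(1, 5000) for _ in range(n_samples)] for _ in range(n_genes)]
lengths_bp = [random.randint(500, 6000) for _ in range(n_genes)]

c, r = cpm(counts), rpkm(counts, lengths_bp)
pcc_diff = abs(pearson(c[0], c[1]) - pearson(r[0], r[1]))
scc_diff = abs(spearman(c[0], c[1]) - spearman(r[0], r[1]))
print(pcc_diff, scc_diff)  # both differences are at floating-point noise level
```

The per-library scaling itself does change each gene's profile across samples, so this invariance only explains why CPM and RPKM agree with each other; the similarity to VST rests on the distributional argument given in the text.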
When assessing inference methods, the simple and widely used correlation methods like PCC and SCC are
less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall
performance. This is consistent with a study in human GCN analysis (Ballouz et al., 2015), but SCC did not
score higher than other correlation methods using GO and PPPTY evaluations. Some genes had higher
performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may
indicate that correlation and MI inference methods capture different kinds of interactions (Meyer et al., 2008;
Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference
methods showed a more robust performance than any single inference method in in silico and E. coli
expression networks, referring to "the wisdom of crowds". However, for analysis of the available maize data,
integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed PCC and
SCC networks (data not shown). This approach was also less effective in more complex S. cerevisiae datasets
than prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether
integrating algorithms can improve GCNs with eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq based maize GCN. This optimization may apply to a range of datasets that share characteristics with maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.

Materials and Methods

RNA-Seq Data Collection and Processing

The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq 2000 or HiSeq 2500 platforms, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). Adapters were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by featureCounts 1.5.0 (Liao et al., 2014) from the aligned bam files (Supplemental Fig. S1). Twenty-six libraries with fewer than 5 million total reads and 11 libraries with a total alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).
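The library exclusion step can be sketched as follows. Only the two thresholds (5 million total reads, 70% total alignment rate) come from the text; the record layout, the run names and the function name keep_library are hypothetical, and the actual pipeline was run with Snakemake rather than with this code.

```python
# Thresholds stated in the text.
MIN_TOTAL_READS = 5_000_000
MIN_ALIGNMENT_RATE = 70.0  # percent

def keep_library(total_reads, alignment_rate):
    """Return True if a library passes both quality filters."""
    return total_reads >= MIN_TOTAL_READS and alignment_rate >= MIN_ALIGNMENT_RATE

# Hypothetical per-library summaries (not real SRA runs).
libraries = [
    {"run": "RUN_A", "total_reads": 21_400_000, "alignment_rate": 91.3},
    {"run": "RUN_B", "total_reads": 3_900_000, "alignment_rate": 88.0},   # too few reads
    {"run": "RUN_C", "total_reads": 12_700_000, "alignment_rate": 64.5},  # low alignment rate
]

kept = [lib["run"] for lib in libraries
        if keep_library(lib["total_reads"], lib["alignment_rate"])]
print(kept)
```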

Gene Count Normalization

The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(CPM/RPKM + 1)). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) method in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in further analysis (15116 genes).
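A simplified sketch of the CPM and RPKM calculations, the log2(x + 1) transformation and the gene filter is given below. It is not the edgeR implementation: the TMM scale factors are deliberately omitted, so library size here is just the raw column sum. The toy counts, gene lengths and function names are assumptions made for illustration, and the filter is run with a small sample threshold because the toy matrix has only three libraries.

```python
import math

def cpm(counts):
    """counts: dict gene -> list of raw counts, one per library.
    Library size is the plain column sum (no TMM scaling in this sketch)."""
    n_lib = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_lib)]
    return {g: [c * 1e6 / lib_sizes[j] for j, c in enumerate(row)]
            for g, row in counts.items()}

def rpkm(counts, gene_length_bp):
    """CPM divided by gene length in kilobases."""
    table = cpm(counts)
    return {g: [v / (gene_length_bp[g] / 1000.0) for v in row]
            for g, row in table.items()}

def log2_plus_one(table):
    """expression = log2(value + 1)."""
    return {g: [math.log2(v + 1) for v in row] for g, row in table.items()}

def filter_genes(cpm_table, min_cpm=2.0, min_samples=1000):
    """Keep genes with CPM above min_cpm in more than min_samples libraries."""
    return [g for g, row in cpm_table.items()
            if sum(v > min_cpm for v in row) > min_samples]

# Toy data: three genes, three libraries.
counts = {
    "g1": [500, 300, 800],
    "g2": [0, 1, 0],
    "g3": [499_500, 699_699, 199_200],
}
lengths = {"g1": 2000, "g2": 1500, "g3": 3000}

cpm_table = cpm(counts)
rpkm_table = rpkm(counts, lengths)
expr = log2_plus_one(cpm_table)
kept = filter_genes(cpm_table, min_cpm=2.0, min_samples=2)
print(kept)
```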

Network Inference

Six correlation coefficient methods and four mutual information methods were applied to normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). Cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned a rank of one. Ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all network weights ranged from zero to one.
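The rank standardization described above can be sketched in a few lines. Two details are assumptions, because the text does not state them: tied weights receive their average rank, and raw (signed) weights are ranked rather than absolute values. The function name rank_standardize and the 3 x 3 matrix are illustrative only.

```python
def rank_standardize(matrix):
    """Rank all entries (smallest = 1), divide by the number of entries,
    then set the diagonal to one. Ties get their average rank (assumption)."""
    n = len(matrix)
    flat = [(matrix[i][j], i, j) for i in range(n) for j in range(n)]
    flat.sort(key=lambda t: t[0])
    total = len(flat)
    ranked = [[0.0] * n for _ in range(n)]
    pos = 0
    while pos < total:
        end = pos
        while end + 1 < total and flat[end + 1][0] == flat[pos][0]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for _, i, j in flat[pos:end + 1]:
            ranked[i][j] = avg_rank / total
        pos = end + 1
    for i in range(n):
        ranked[i][i] = 1.0
    return ranked

# Toy correlation matrix for three genes.
pcc = [
    [1.0, 0.8, -0.2],
    [0.8, 1.0, 0.5],
    [-0.2, 0.5, 1.0],
]
weights = rank_standardize(pcc)
for row in weights:
    print([round(v, 3) for v in row])
```

Because every method is mapped onto the same zero-to-one scale, networks built with different inference methods become directly comparable.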

Network Performance Evaluation

To generate the random networks, gene IDs were shuffled randomly in CPM or VST normalized expression matrices. The randomized expression matrices were then used for network inference by the PCC, MRNET or CLR methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
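One way to read the randomization step is that expression profiles stay intact while the gene labels attached to them are permuted, so that any later comparison with annotation data is effectively random. The sketch below follows that reading; the function name, the gene IDs and the expression values are hypothetical.

```python
import random

def shuffle_gene_ids(expression, seed=None):
    """Reassign gene IDs to expression profiles at random.
    expression: dict gene ID -> list of expression values."""
    rng = random.Random(seed)
    gene_ids = list(expression)
    shuffled = gene_ids[:]
    rng.shuffle(shuffled)
    return {new_id: expression[old_id]
            for new_id, old_id in zip(shuffled, gene_ids)}

# Toy normalized expression matrix.
expression = {
    "gene1": [5.1, 6.0, 5.8],
    "gene2": [0.2, 0.1, 0.4],
    "gene3": [9.3, 8.7, 9.9],
    "gene4": [2.2, 3.1, 2.7],
}

# Each repeat would be followed by network inference and evaluation.
randomized = shuffle_gene_ids(expression, seed=1)
print(sorted(randomized))
```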
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions were used for evaluation, defined as those ranking in the top 5% of their results. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expressed examples for evaluation.
The widely used Area under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluations. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for GO datasets were calculated with the EGAD package (Ballouz et al., 2016) in R. It utilizes the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding a subset of gene GO terms and testing whether the hidden GO terms can be predicted from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
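As a reminder of what an AUROC value measures, the sketch below computes it directly from its probabilistic definition: the chance that a known co-expressed pair receives a higher network weight than a pair not known to be co-expressed, with ties counted as one half. This is not the ROCR or EGAD code, the quadratic loop is only practical for small inputs, and the weights and labels are invented.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties contribute 0.5."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical network weights for six gene pairs and whether each pair
# is listed as co-expressed in the evaluation dataset.
edge_weights = [0.95, 0.90, 0.70, 0.60, 0.40, 0.20]
known_pair = [1, 1, 0, 1, 0, 0]

value = auroc(edge_weights, known_pair)
print(round(value, 3))
```

A value near 0.5 corresponds to random ranking, which is why the randomized networks serve as the baseline.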
Definition of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it does have that GO term in our GO dataset.
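The pair-level definitions can be made concrete with a small counting sketch. A single weight threshold is used here to decide whether a network "predicts" co-expression; that threshold is an assumption for illustration, since an ROC curve sweeps over all possible thresholds. Gene names, weights and the function name are hypothetical.

```python
def confusion_counts(weights, gold_pairs, threshold):
    """weights: dict frozenset({a, b}) -> network weight.
    gold_pairs: set of frozensets listed as co-expressed in the reference.
    Returns (TP, FP, TN, FN) at the given threshold."""
    tp = fp = tn = fn = 0
    for pair, w in weights.items():
        predicted = w >= threshold
        actual = pair in gold_pairs
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and not actual:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def pair(a, b):
    return frozenset((a, b))

weights = {
    pair("g1", "g2"): 0.9, pair("g1", "g3"): 0.8, pair("g1", "g4"): 0.1,
    pair("g2", "g3"): 0.2, pair("g2", "g4"): 0.3, pair("g3", "g4"): 0.4,
}
gold = {pair("g1", "g2"), pair("g3", "g4")}

tp, fp, tn, fn = confusion_counts(weights, gold, threshold=0.5)
tpr = tp / (tp + fn)  # true positive rate
fpr = fp / (fp + tn)  # false positive rate
print(tp, fp, tn, fn, tpr, fpr)
```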

Network Clustering and Characterization

For each network, the top 1 million edges were selected as a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distributions were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14.137 and an inflation value of 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
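Selecting the strongest edges and tabulating node degree can be sketched as below. The topological analysis itself was done in Cytoscape; this code only illustrates the idea on a toy edge list with a cutoff of three edges in place of one million, and the function names are illustrative.

```python
import heapq
from collections import Counter

def top_edges(weights, n_edges):
    """Return the n_edges highest-weight edges as ((a, b), weight) tuples."""
    return heapq.nlargest(n_edges, weights.items(), key=lambda kv: kv[1])

def node_degrees(edges):
    """Count how many selected edges touch each gene."""
    degree = Counter()
    for (a, b), _ in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

# Toy rank-standardized weights for five gene pairs.
weights = {
    ("g1", "g2"): 0.99,
    ("g1", "g3"): 0.97,
    ("g2", "g3"): 0.40,
    ("g1", "g4"): 0.95,
    ("g3", "g4"): 0.30,
}

stringent = top_edges(weights, n_edges=3)
degrees = node_degrees(stringent)
print(stringent)
print(degrees.most_common())
```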

Gene Ontology Enrichment and Visualization

Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes included in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
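The two statistical steps can be sketched as follows: an upper-tail hypergeometric p-value for one GO term, and a Benjamini-Yekutieli style adjustment across terms. The enrichment was actually run in AgriGO, so this is only a sketch of the underlying calculations; it assumes that AgriGO's "Yekutieli" option corresponds to the Benjamini-Yekutieli procedure, and the numbers are invented.

```python
from math import comb

def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when n query genes are drawn from N background genes,
    K of which carry the GO term, and k query genes carry it."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def benjamini_yekutieli(pvalues):
    """Adjusted p-values under the Benjamini-Yekutieli procedure."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from largest p-value to smallest
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted

# Small worked case: 10 background genes, 4 annotated, 3 queried, 2 hits.
p_small = hypergeom_pvalue(k=2, K=4, n=3, N=10)

# Hypothetical raw p-values for three GO terms.
raw = [0.001, 0.04, 0.03]
fdr = benjamini_yekutieli(raw)
significant = [i for i, q in enumerate(fdr) if q <= 0.05]
print(p_small, fdr, significant)
```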

Databases Comparison on Cell Wall Pathway

Sixteen well characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05 and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4, with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were blasted against TAIR10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments

We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data

Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized and log2 transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.

Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg") and protein network (dark grey, "pr") using PCC, SCC, MRNET and CLR. A, GO evaluation on networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation on networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files) and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported to the R environment for the normalization, inference and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by less than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by more than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized and log2 transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2" and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. The red and blue curves show the power-law fitted distributions. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network). The number of edges linked to each gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited

Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: 10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, de Leon N, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of Illumina RNA-Seq samples per year (solid line) for 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
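The AUROC values reported throughout Figure 2 can be illustrated with a minimal sketch. It assumes each candidate gene carries a network-derived score and a binary label (for example, membership in a GO term). The function name `auroc` and the toy data are hypothetical, and the pipeline actually used for the figure is not shown in this section and may compute AUROC differently.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity, using average ranks for ties."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Assign 1-based ranks in ascending score order; tied scores share their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    positive_rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    u_statistic = positive_rank_sum - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical example: four genes, two of which belong to the gene set.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auroc(toy_scores, toy_labels))  # 0.75
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives, which is why the boxplots in Figure 2 can be compared directly across normalization and inference methods.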
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are shown on a blue-to-red color scale. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
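The scaling and clustering steps described for this heatmap can be sketched as follows. This is only an illustration under stated assumptions: each row is taken to be one GO term (or gene) and each column one inference method, rows are z-scored with the sample standard deviation, and methods are compared by Euclidean distance between their columns. The function names and the toy matrix are hypothetical, and the caption does not say which axis was scaled or which clustering linkage was used.

```python
import math
from statistics import mean, stdev


def scale_rows(rows):
    """Z-score each row (one GO term or gene measured across methods)."""
    scaled = []
    for row in rows:
        mu, sd = mean(row), stdev(row)
        # A constant row carries no contrast between methods; map it to zeros.
        scaled.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return scaled


def column_distances(rows, names):
    """Pairwise Euclidean distance between method columns of a scaled matrix."""
    columns = list(zip(*rows))
    return {
        (names[i], names[j]): math.dist(columns[i], columns[j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }


# Hypothetical AUROC matrix: 3 GO terms (rows) x 3 methods (columns).
methods = ["PCC", "SCC", "CLR"]
toy_aurocs = [
    [0.62, 0.64, 0.71],
    [0.55, 0.56, 0.68],
    [0.70, 0.69, 0.66],
]
for pair, distance in column_distances(scale_rows(toy_aurocs), methods).items():
    print(pair, round(distance, 3))
```

Methods whose columns lie close together behave similarly across GO terms, which is the pattern a hierarchical clustering of the heatmap would group.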
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness. Panel titles in A: GO PCC, GO SCC, GO MRNET, GO CLR. Panel titles in B: PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR.
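The regression described in this caption, average AUROC against the natural logarithm of sample size, can be sketched with ordinary least squares. The caption states that the fit was done with lm() in R; the Python below is a stand-in written for illustration, the function name `fit_log_model` and the toy data are hypothetical, and p-values are omitted because they would need a t-distribution that the standard library does not provide.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Ordinary least squares fit of AUROC = intercept + slope * ln(sample size)."""
    xs = [math.log(n) for n in sample_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(aurocs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("sample sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return intercept, slope, r_squared


# Hypothetical data generated from an exact logarithmic trend.
sizes = [12, 100, 500, 1266]
toy_aurocs = [0.55 + 0.02 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, toy_aurocs)
print(round(intercept, 3), round(slope, 3), round(r_squared, 3))
```

A positive slope on the log scale means each doubling of the sample size adds a roughly constant increment to the average AUROC, so gains shrink in absolute terms as more libraries are added.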
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries plotted against sample size. Networks built from the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. Panel titles: Protein GO (A), Protein PPPTY (B).
[Figure 7 image. Panels A and B are Venn diagrams for the MS, PA and SA networks. Panels C and D are GO term enrichment networks whose node labels center on chromatin organization, DNA replication and gene expression terms (C, labeled Hub) and on cell wall biogenesis, cellulose and polysaccharide metabolism terms (D, labeled Cell Wall).]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
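The overlap analysis summarized in the Venn diagrams can be sketched with plain set operations. The sketch assumes that "highly interacting" means highest degree (number of edges per gene), which the caption does not state explicitly. The function names, gene identifiers and degree values are hypothetical.

```python
def top_k_genes(degree, k):
    """Return the k genes with the largest degree (number of network edges)."""
    ranked = sorted(degree, key=degree.get, reverse=True)
    return set(ranked[:k])


def venn3_counts(a, b, c):
    """Count genes in each of the seven regions of a three-set Venn diagram."""
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


# Hypothetical degree tables for three networks (for example PA, SA and MS).
pa = top_k_genes({"g1": 9, "g2": 7, "g3": 5, "g4": 1}, 3)
sa = top_k_genes({"g1": 8, "g3": 6, "g5": 4, "g2": 2}, 3)
ms = top_k_genes({"g1": 9, "g6": 5, "g3": 3, "g4": 2}, 3)
print(venn3_counts(pa, sa, ms))
```

The same two functions apply to edges instead of genes if each edge is represented as a hashable pair such as a sorted tuple of its two gene identifiers.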
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Downloaded on August 22 2020 - Published by www.plantphysiol.org
Copyright © 2017 American Society of Plant Biologists All rights reserved
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Page | 14
(Maza et al., 2013; Dillies et al., 2013b; Zyprych-Walczak et al., 2015). This was not observed in the maize
GCN testing. It is possible that the large number of samples from various labs created enough heterogeneity
among samples that normalization effects were minimized (Paulson et al., 2016). Furthermore,
normalization is applied on a per-library basis, which means that genes within the same library are normalized by similar factors.
So when the network is constructed by PCC or BIC, where expression vectors are centered by mean or
median values, the effect of different normalization methods is probably small. The two rank correlations, SCC
and KCC, consider only differences in relative rankings, on which normalization has a limited effect. The same holds for
the GCC method. The estimation of mutual information is based on the k-nearest neighbor method implemented in
parmigene (Sales and Romualdi, 2011). Since the three normalization methods produced similar expression
distributions (Supplemental Fig. 2), MI estimates from the different normalizations are expected to be similar.
When assessing inference methods, simple and widely used correlation methods such as PCC and SCC are less time-consuming than MI methods. This analysis showed that PCC/SCC-built GCNs had better overall performance. This is consistent with a study of human GCN analysis (Ballouz et al., 2015), but SCC did not score higher than other correlation methods in the GO and PPPTY evaluations. Some genes had higher performance using MI methods, but this effect was limited to evaluation with the PPPTY data. This may indicate that correlation and MI inference methods capture different kinds of interactions (Meyer et al., 2008; Marbach et al., 2012; Song et al., 2012). Marbach et al. (2012) stated that integration of multiple inference methods showed more robust performance than any single inference method in in silico and E. coli expression networks, referring to this as "the wisdom of crowds". However, for analysis of the available maize data, integration of PCC, SCC, MRNET, and CLR together did not result in a network that outperformed the PCC and SCC networks (data not shown). This approach was also less effective in the more complex S. cerevisiae datasets than in prokaryotic networks (Marbach et al., 2012), suggesting that more work is required to determine whether integrating algorithms can improve GCNs built from eukaryotic data.
In conclusion, we extensively evaluated normalization methods and inference methods for building an RNA-Seq-based maize GCN. This optimization may apply to a range of datasets that share characteristics of maize, including a large and heterogeneous genome with rich and diverse transposable element composition and limited gene annotation.

Materials and Methods
RNA-Seq Data Collection and Process
The maize genome and its annotation were downloaded from Ensembl Plants Release 31 (http://plants.ensembl.org). The original 1303 RNA-Seq samples, sequenced on Illumina HiSeq2000 or HiSeq2500, were downloaded from the NCBI Sequence Read Archive (SRA) (Leinonen et al., 2010). The downloaded files were converted to fastq format using the fastq-dump command in SRA Toolkit (version 2.5.2). Adapters were trimmed from the fastq files with Cutadapt 1.8.1 (Martin, 2011). The adapter-trimmed files were then quality checked with FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by featureCounts 1.5.0 (Liao et al., 2014) from the aligned bam files (Supplemental Fig. S1). 26 libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM value). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were included in further analysis (15116 genes).
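
As a rough illustration of the CPM step and the gene filter described above, the following Python sketch computes CPM from raw counts, applies log2(x + 1), and keeps genes that exceed a CPM threshold in more than a given number of samples. It is not the edgeR code used in this study: the function names are hypothetical, TMM scale factors are only accepted as an optional precomputed input (they are not estimated here), and the toy count matrix is invented for the example.

```python
import math

def cpm(counts, norm_factors=None):
    """Counts per million for a matrix given as gene rows of per-sample raw counts.

    norm_factors (assumption): optional per-sample scale factors, e.g. TMM
    factors estimated elsewhere; they rescale the library sizes.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    if norm_factors is not None:
        lib_sizes = [size * f for size, f in zip(lib_sizes, norm_factors)]
    return [[row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)] for row in counts]

def log2_plus_one(matrix):
    """Apply log2(x + 1) to every entry."""
    return [[math.log2(v + 1.0) for v in row] for row in matrix]

def expressed_genes(cpm_matrix, min_cpm=2.0, min_samples=1000):
    """True for genes with CPM > min_cpm in more than min_samples samples."""
    return [sum(v > min_cpm for v in row) > min_samples for row in cpm_matrix]

# Toy data (invented): three genes, two libraries of one million reads each.
counts = [
    [500000, 250000],  # gene A
    [499999, 749999],  # gene B
    [1, 1],            # gene C, below the CPM threshold
]
cpm_values = cpm(counts)
log_values = log2_plus_one(cpm_values)
# The study used min_samples=1000; the toy matrix has only two samples.
keep = expressed_genes(cpm_values, min_cpm=2.0, min_samples=1)
```

With the thresholds reported in the text, the call would be `expressed_genes(cpm_values, min_cpm=2.0, min_samples=1000)` on the full 1266-sample matrix.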

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. The Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. The Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). The Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). The cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned rank one. The ranks were then divided by the number of elements in the matrix, and the diagonal was set to one, so that all network weights ranged from zero to one.
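
The rank standardization described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the authors' R code: the function name is hypothetical, all matrix entries (including the diagonal) are ranked together, and tied values receive their average rank, which is an assumption because the text does not state how ties were handled.

```python
def rank_normalize(adj):
    """Rank-standardize a square adjacency matrix (list of lists).

    Entries are ranked in ascending order (smallest value gets rank 1),
    ranks are divided by the total number of entries, and the diagonal is
    set to 1. Ties share their average rank (assumption).
    """
    n = len(adj)
    flat = sorted((adj[i][j], i, j) for i in range(n) for j in range(n))
    total = len(flat)
    out = [[0.0] * n for _ in range(n)]
    k = 0
    while k < total:
        m = k
        # Extend m to the end of the current run of tied values.
        while m + 1 < total and flat[m + 1][0] == flat[k][0]:
            m += 1
        avg_rank = (k + m + 2) / 2.0  # mean of the 1-based ranks k+1 .. m+1
        for _, i, j in flat[k:m + 1]:
            out[i][j] = avg_rank / total
        k = m + 1
    for i in range(n):
        out[i][i] = 1.0
    return out

# Toy correlation matrix (invented values) for three genes.
toy = [
    [1.0, 0.2, 0.8],
    [0.2, 1.0, -0.5],
    [0.8, -0.5, 1.0],
]
ranked = rank_normalize(toy)
```

Because only the ordering of weights is kept, networks produced by methods with very different score scales (for example PCC and mutual information) become directly comparable after this step.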

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in the CPM- or VST-normalized expression matrices. The randomized expression matrices were then used for network inference by the PCC, MRNET, or CLR method and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq-confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package (Ballouz et al., 2016) in R. EGAD uses the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding a subset of gene GO terms and testing whether the hidden GO terms can be predicted from the remaining annotations. The prediction model performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
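
To make the AUROC measure concrete, the sketch below computes it from a list of predicted edge weights and binary labels using the rank-sum (Mann-Whitney U) identity. It is a minimal standalone illustration, not the ROCR or EGAD code used in this study; the function name and the toy scores are invented for the example.

```python
def auroc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels.

    Uses the identity AUROC = U / (n_pos * n_neg), where U is the
    Mann-Whitney statistic of the positive examples. Tied scores share
    their average rank.
    """
    pairs = sorted(zip(scores, labels), key=lambda t: t[0])
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    rank_sum_pos = 0.0
    k = 0
    while k < n:
        m = k
        # Extend m to the end of the current run of tied scores.
        while m + 1 < n and pairs[m + 1][0] == pairs[k][0]:
            m += 1
        avg_rank = (k + m + 2) / 2.0  # mean of the 1-based ranks k+1 .. m+1
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[k:m + 1])
        k = m + 1
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)

# Toy example (invented): edge weights for four gene pairs, where label 1
# marks a pair that is co-expressed according to a reference dataset.
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auroc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
uninformative = auroc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

An AUROC of 0.5 corresponds to a network whose weights carry no information about the reference pairs, which is the role played by the shuffled-gene-ID networks described above.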
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts that two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts that two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts that two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts that two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts that a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts that a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts that a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts that a gene does not have a specific GO term but it does have that GO term in our GO dataset.
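
The pairwise definitions above can be expressed as a small counting routine. The following Python sketch is only an illustration under stated assumptions: gene pairs are treated as unordered, a "prediction" is the presence of an edge in a thresholded network, and the gene names and edges are invented.

```python
from itertools import combinations

def confusion_counts(genes, predicted_edges, reference_edges):
    """Count TP, FP, TN, and FN over all unordered gene pairs.

    predicted_edges: pairs the network calls co-expressed.
    reference_edges: pairs that are co-expressed in the reference dataset.
    """
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, b in combinations(sorted(genes), 2):
        pair = frozenset((a, b))
        if pair in predicted and pair in reference:
            counts["TP"] += 1
        elif pair in predicted:
            counts["FP"] += 1
        elif pair in reference:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts

# Hypothetical gene names and edges, for illustration only.
genes = ["g1", "g2", "g3", "g4"]
predicted_edges = [("g1", "g2"), ("g1", "g3")]
reference_edges = [("g2", "g1"), ("g3", "g4")]
result = confusion_counts(genes, predicted_edges, reference_edges)
```

Sweeping the edge-weight threshold and recomputing these counts yields the true positive and false positive rates from which the ROC curve is drawn.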

Network Clustering and Characterization
For each network, the top 1 million edges were selected as the stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distributions were plotted with the Network Analyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL), with MCL v14-137 and the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
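
Selecting the strongest edges from a weighted network, as described above, can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the function name is hypothetical, the adjacency matrix is assumed to be symmetric so only the upper triangle is scanned, and the toy weights are invented.

```python
import heapq

def top_edges(adj, gene_ids, k):
    """Return the k highest-weight edges as (weight, gene_a, gene_b) tuples.

    adj is a symmetric matrix given as a list of lists; self-edges on the
    diagonal are ignored. Results are sorted from strongest to weakest.
    """
    n = len(adj)
    candidates = (
        (adj[i][j], gene_ids[i], gene_ids[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return heapq.nlargest(k, candidates, key=lambda edge: edge[0])

# Toy rank-standardized weights (invented) for three genes.
toy_adj = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.7],
    [0.5, 0.7, 1.0],
]
strongest = top_edges(toy_adj, ["a", "b", "c"], k=2)
```

For the networks in this study, `k` would be 1,000,000. With 15116 genes there are roughly 114 million candidate gene pairs, so a bounded heap avoids sorting the full edge list.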

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
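
For readers unfamiliar with the enrichment statistics named above, the sketch below shows a hypergeometric upper-tail test and a Benjamini-Yekutieli adjustment in plain Python. It does not reproduce AgriGO, which is a web tool whose internals are not described here; the function names and the toy numbers are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, background, annotated, selected):
    """P(X >= k): chance of seeing at least k annotated genes among
    `selected` genes drawn from `background` genes, of which `annotated`
    carry the GO term."""
    upper = min(annotated, selected)
    total = comb(background, selected)
    favorable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(k, upper + 1)
    )
    return favorable / total

def benjamini_yekutieli(pvalues):
    """Adjusted p-values controlling the FDR under arbitrary dependence."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction term
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted

# Toy example (invented numbers): 10 background genes, 4 annotated with a
# term, 3 genes in the query list, 2 of which are annotated.
p_toy = hypergeom_upper_tail(2, background=10, annotated=4, selected=3)
adj_toy = benjamini_yekutieli([0.01, 0.04, 0.03])
```

In the setting described in the text, `background` would be 15116, and terms whose adjusted value exceeds 0.05 would be discarded.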

Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized and log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 PPPTY gene set (ALL_1720) genes with highest AUROC values in CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from comparisons with the GO dataset for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.

Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or in MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to the standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-square and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig. 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks. 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by more than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized and log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stage. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the pairwise AUROC plots and the PCC values. AUROC values evaluated by the GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values calculated by Pearson correlation shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the pairwise AUROC plots and the PCC values. AUROC values evaluated by the PPPTY datasets are plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values calculated by Pearson correlation shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of all 1720 PPPTY gene set (ALL_1720) genes with highest AUROC values in CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distributions. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to the genes (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science (80- ) 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D 847 Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput 848 RNA sequencing data analysis Brief Bioinform 14 671ndash683 849
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization 850 of biological networks and protein structures Nature Protoc 7 670ndash85 851
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24 852
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis 853 of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA 854 pathway to maize development PLoS Genet 10 e1004826 855
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray 856 data using random matrix theory Hortic Res 2 15026 857
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community 858 Nucleic Acids Res 38 64-70 859
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein 860 families Nucleic Acids Res 30 1575ndash1584 861
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C 862 Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel 863 genes influencing glucose metabolism Proc Natl Acad Sci 111 13924ndash13929 864
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) 865 Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of 866 expression profiles PLoS Biol 5 0054ndash0066 867
Fedoroff N V (2012) McClintockrsquos challenge in the 21st century Proc Natl Acad Sci 109(50) 20200ndash20203 868
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules 869 between two grass species maize and rice Plant Physiol 156 1244ndash56 870
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1 871
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing 872 reveals the complex regulatory network in the maize kernel Nature Commun 42832 873
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent 874 Variables Artificial Intelligence and Statistics 277-286 875
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function 876 Bioinformatics 27 1860ndash1866 877
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression 878 networks in Arabidopsis thaliana Bioinformatics 2 1ndash8 879
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 26
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR 880 (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc 881 Natl Acad Sci 107 12866ndash12871 882
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges 883 Bioinform Biol Insights 9 29ndash46 884
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 885 4 e1000117 886
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene 887 Expression in Maize Int Rev Cell Mol Biol 328 25ndash48 888
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de 889 novo coexpression network inference Bioinformatics 28 1592ndash1597 890
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat 891 Methods 12 357ndash360 892
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 893 2520ndash2522 894
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning 895 causality from time and perturbation Genome Biol 14 123 896
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and 897 divergence times Mol Biol Evol 34 1812ndash1819 898
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene 899 association methods for coexpression network construction and biological knowledge discovery PLoS 900 One 7 e50411 901
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC 902 Bioinformatics 9 559 903
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019 904
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide 905 Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of 906 Opaque2 in Maize Plant Cell 27 532-545 907
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide 908 association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43ndash909 50 910
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High 911 Performance Reverse Engineering Analysis 2013 912
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of 913 Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347 914
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE 915 Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602ndash4616 916
Li S Łabaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and 917 correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888ndash895 918
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and 919 Analysis Trends Plant Sci 20 664ndash675 920
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence 921 reads to genomic features Bioinformatics 30 923ndash930 922
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures 923 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 27
Effects on reverse engineering gene networks Bioinformatics pp 282ndash288 924
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing 925 genes associated with complex agronomic traits in rice Plant J 90 177-188 926
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) 927 The genotype-tissue expression (GTEx) project Nat Genet 45 580ndash585 928
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data 929 with DESeq2 Genome Biol 15 1 930
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome 931 mapping based on collaborative filtering framework Sci Rep 5 7702 932
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in 933 transcriptome analysis Plant Physiol 160 192ndash203 934
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic 935 networks Bioinformatics 19 1423ndash1430 936
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-937 expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 938 9 e1003840 939
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR 940 Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796ndash804 941
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE 942 an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC 943 Bioinformatics 7 S7 944
Mark Cigan A Unger‐Wallace E Haug‐Collet K (2005) Transcriptional gene silencing as a tool for uncovering 945 gene function in maize Plant J 43 929ndash940 946
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 947 pp-10 948
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for 949 differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied 950 transcriptomes Commun Integr Biol 6 e25849 951
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792ndash952 801 953
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional 954 regulatory networks Eurasip J Bioinforma Syst Biol doi 101155200779879 955
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional 956 networks using mutual information BMC Bioinformatics 9 461 957
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J 958 Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis 959 Plant Genome 6 12 960
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A 961 Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize 962 pericarp color1 controlled genes Plant Cell 24 2745ndash64 963
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker 964 a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436 965
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian 966 transcriptomes by RNA-Seq Nat Methods 5 621ndash628 967
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 28
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 968 69ndash71 969
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks 970 for Arabidopsis Nucleic Acids Res 37 D987ndashD991 971
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene 972 modules with biological information in plants Bioinformatics 26 1267ndash1268 973
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol 974 Direct 4 14 975
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray 976 data BMC Bioinformatics 4 33 977
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush 978 J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data 979 bioRxiv 81802 980
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et 981 al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in 982 Maize Plant Cell Online 2 tpc114132506 983
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty 984 DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703ndash1728 985
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing 986 maize leaf Plant J 78 424ndash440 987
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput 988 transcriptome sequencing experiments Bioinformatics 29 2146ndash2152 989
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression 990 analysis of digital gene expression data Bioinformatics 26 139ndash140 991
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene 992 network reconstruction Bioinformatics 27 1876ndash1877 993
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why 994 stability does not indicate accuracy in a sea of changing annotations Database J Biol databases 995 curation 2016 996
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H 997 Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under 998 natural field conditions Nucleic Acids Res 39 D1141ndashD1148 999
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize 1000 transcriptomes using COB the co-expression browser PLoS One doi 101371journalpone0099193 1001
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package 1002
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics 1003 Science (80- ) 326 1112ndash1115 1004
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global 1005 quantification of mammalian gene expression control Nature 473 337ndash342 1006
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-1007 expression modules in mouse crosses Frontiers in Genetics 20134291 1008
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities 1009 and Challenges Front Plant Sci 7 444 1010
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) 1011 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 29
Cytoscape a software environment for integrated models of biomolecular interaction networks Genome 1012 Res 13 2498ndash2504 1013
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R 1014 Bioinformatics 21 3940ndash3941 1015
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of 1016 viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443ndash458 1017
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information 1018 correlation and model based indices BMC Bioinformatics 13 328 1019
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An 1020 expanded maize gene expression atlas based on RNA-sequencing and its use to explore root 1021 development Plant Genome 314ndash362 1022
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A 1023 Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life 1024 Nucleic Acids Res 43 D447ndashD452 1025
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative 1026 Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194 1027
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S 1028 Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and 1029 caveats Plant Cell Environ 32 1633ndash51 1030
USDA (2016) Grain World Markets and Trade 1031
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant 1032 Physiol 153 895ndash905 1033
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism 1034 and Beyond Nat Rev Mol Cell Biol 16 258ndash264 1035
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR 1036 (2016) Integration of omic networks in a developmental atlas of maize Science 353 814ndash818 1037
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene 1038 ranking BMC Bioinformatics 16 S6 1039
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring genendashgene interactions and functional 1040 modules using sparse canonical correlation analysis Ann Appl Stat 9 300ndash323 1041
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 1042 57ndash63 1043
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray 1044 experiments BMC Genomics 5 87 1045
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability ofldquo guilt-by-associationrdquo 1046 within gene coexpression networks BMC Bioinformatics 6 227 1047
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) ldquoOut of Pollenrdquo Hypothesis for Origin of New 1048 Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822ndash2829 1049
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of 1050 Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during 1051 Seed Development Plant Cell 28 629ndash645 1052
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 1053 13 83 1054
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC 1055 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 30
Bioinformatics 12 290 1056
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene 1057 network reconstruction PLoS One 9 e106319 1058
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation 1059 Plant Cell Physiol 56 195ndash214 1060
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein 1061 interaction database for Maize Plant Physiol 170 pp1501821- 1062
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) 1063 The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015 1064
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for samples constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for samples constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
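The box-plot outlier rule stated in this caption (values more than 1.5 times the interquartile range beyond the quartiles) can be illustrated with a minimal Python sketch. The function name `iqr_outliers`, the example values, and the choice of the "inclusive" quartile definition are assumptions made for illustration; they are not taken from the article, and plotting software may use a slightly different quartile convention.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, median, Q3).
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values, for illustration only (not data from the article).
auroc_values = [0.50, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.90]
outliers = iqr_outliers(auroc_values)
print(outliers)
```

With these illustrative values only the single extreme value falls outside the fences; the remaining values would be drawn inside the whiskers.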
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are colored from low (blue) to high (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
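The two operations named in the Figure 3 caption, scaling AUROC values to a standard normal distribution and comparing inference methods by Euclidean distance, can be sketched as follows. This is a minimal illustration under stated assumptions: scaling is applied per GO term (per row) using the sample standard deviation, the method subset and all numbers are hypothetical, and the helper names `zscore` and `euclidean` are not from the article.

```python
import math
from statistics import mean, stdev


def zscore(values):
    """Scale values to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical AUROC table: rows are GO terms, columns are inference methods.
methods = ["PCC", "SCC", "CLR"]
auroc = [
    [0.60, 0.62, 0.70],
    [0.55, 0.56, 0.65],
    [0.70, 0.71, 0.58],
]

# Assumption: each GO term (row) is scaled across methods.
scaled = [zscore(row) for row in auroc]

# One scaled profile per method, then pairwise distances between profiles.
profiles = {m: [row[j] for row in scaled] for j, m in enumerate(methods)}
d_pcc_scc = euclidean(profiles["PCC"], profiles["SCC"])
d_pcc_clr = euclidean(profiles["PCC"], profiles["CLR"])
print(d_pcc_scc, d_pcc_clr)
```

A distance matrix built this way is the usual input to hierarchical clustering; methods with similar per-term performance profiles end up close together.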
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
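The logarithmic regression described in this caption has the form average AUROC = a + b * ln(sample size). The caption states that the fit was done with lm() in R; the sketch below is a Python stand-in using ordinary least squares, not the authors' code. The function name `fit_log_model` and the data are assumptions: the AUROC values are synthetic and generated from a known line so the fit can be checked.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    xs = [math.log(n) for n in sample_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(aurocs) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in AUROC explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Synthetic example (not data from the article): AUROC rises with ln(sample size).
sizes = [12, 50, 200, 600, 1266]
aurocs = [0.55 + 0.02 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, aurocs)
print(intercept, slope, r_squared)
```

Because the synthetic values lie exactly on a logarithmic curve, the fit recovers the generating intercept and slope; real AUROC values would scatter around the line and give an R-squared below 1.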
Figure 5. Evaluation of networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean and grey dots indicate outliers.
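The AUROC values compared in these panels measure how well a ranking separates known positives from negatives. As a minimal sketch, assuming each gene has a prediction score and a binary label for one functional category, AUROC equals the fraction of positive-negative pairs in which the positive gene scores higher, with ties counted as one half. The function name `auroc`, the scores, and the labels below are hypothetical and do not reproduce the authors' evaluation pipeline.

```python
def auroc(scores, labels):
    """AUROC as the probability that a positive outranks a negative.

    scores: prediction scores, higher meaning more likely positive.
    labels: 1 (or True) for known positives, 0 (or False) for negatives.
    """
    positives = [s for s, lab in zip(scores, labels) if lab]
    negatives = [s for s, lab in zip(scores, labels) if not lab]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for five genes; three are annotated to the category.
example = auroc([0.9, 0.8, 0.7, 0.6, 0.5], [1, 1, 0, 1, 0])
print(example)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the captions compare networks by how far their AUROC distributions sit above 0.5. The pairwise loop is quadratic and intended only for small illustrations.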
[Figure 7 graphic: panels A and B are Venn diagrams of the MS, PA, and SA networks; panels C and D are GO term enrichment maps labeled "Hub" and "Cell Wall".]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks. 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Parsed Citations

Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545-3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707-726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572-15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143-175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670-85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575-1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924-13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200-20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244-56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860-1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1-8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866-12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29-46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: 10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2 tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, De Leon N, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
checked by FastQC v0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). HISAT2 v2.0.4 (Kim et al., 2015) was used for genome alignment. Gene-level raw read counts were calculated by featureCounts 1.5.0 (Liao et al., 2014) from aligned BAM files (Supplemental Fig. S1). Twenty-six libraries with fewer than 5 million total reads and 11 libraries with an overall alignment rate below 70% were excluded, leaving 1266 samples (Supplemental Table S1) for the final expression table. The processing protocol was streamlined with Snakemake v3.7.1 (Köster and Rahmann, 2012).

Gene Count Normalization
The expression data were normalized using three different methods before constructing GCNs. Counts Per Million (CPM) and Reads Per Kilobase per Million (RPKM) were calculated with the edgeR package (Robinson et al., 2010) in the R environment and then log2 transformed (expression = log2(x + 1), where x is the CPM or RPKM value). For both methods, scale factors between samples were estimated by the Trimmed Mean of M-values (TMM) method in edgeR. Variance Stabilizing Transformation (VST) was calculated with the DESeq2 package (Love et al., 2014). Only genes with expression higher than 2 CPM in more than 1000 samples were retained for further analysis (15116 genes).
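The CPM calculation, the log2(x + 1) transform and the expression filter described above can be illustrated with a minimal Python sketch. It is not the edgeR implementation: it uses plain library-size scaling and omits the TMM scale factors, and the function names and toy counts are hypothetical.

```python
import math

def cpm(counts):
    """Convert a genes x samples matrix of raw counts to counts per million.

    Assumption: plain library-size scaling; the TMM scale factors that
    edgeR would apply are deliberately left out of this sketch.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]

def log2_plus_one(matrix):
    """Apply expression = log2(x + 1) element-wise."""
    return [[math.log2(x + 1) for x in row] for row in matrix]

def keep_expressed(cpm_matrix, min_cpm=2.0, min_samples=1000):
    """Return row indices of genes above min_cpm in more than min_samples samples."""
    return [i for i, row in enumerate(cpm_matrix)
            if sum(1 for x in row if x > min_cpm) > min_samples]

# Toy data: 3 genes x 2 samples (hypothetical counts, 1 million reads per sample).
toy_counts = [[500_000, 250_000], [499_999, 749_999], [1, 1]]
toy_cpm = cpm(toy_counts)
# With only two toy samples, require expression in more than one sample.
toy_kept = keep_expressed(toy_cpm, min_cpm=2.0, min_samples=1)
toy_log = log2_plus_one([toy_cpm[i] for i in toy_kept])
```

With the thresholds from the text (min_cpm=2, min_samples=1000), the same filter would be applied to the full 1266-sample matrix.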

Network Inference
Six correlation coefficient methods and four mutual information methods were applied to the normalized gene expression data to construct GCNs. All computing steps were done in the R 3.3.1 environment. Pearson Correlation Coefficient (PCC) and Spearman Correlation Coefficient (SCC) were calculated with the cor() function. Kendall rank Correlation Coefficient was calculated using the cor.fk() function in the pcaPP package (Filzmoser et al., 2009). Gini Correlation Coefficient was calculated with the adjacencymatrix() function in the rsgcc package (Ma and Wang, 2012). Biweight midcorrelation was computed with the bicor() function in the WGCNA package (Langfelder and Horvath, 2008). Cosine similarity coefficient was computed with the cosine() function in the coop package (Schmidt, 2016). Mutual information results were computed using the parmigene package (Sales and Romualdi, 2011). The adjacency matrix weights derived from the ten inference methods were ranked, with the smallest value assigned rank one. The ranks were then divided by the number of elements in the matrix and the diagonal was set to one, so that all network weights ranged from zero to one.
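The rank standardization in the last two sentences can be sketched as follows. This is an illustrative Python version rather than the R code used in the study; the function name is hypothetical, and the use of average ranks for ties is an assumption, since the text does not state how ties were handled.

```python
def rank_normalize(adj):
    """Rank all weights (smallest = 1), divide by the element count, set diagonal to 1.

    Assumption: tied values receive their average rank.
    """
    n = len(adj)
    flat = [(adj[i][j], i, j) for i in range(n) for j in range(n)]
    flat.sort(key=lambda t: t[0])
    total = len(flat)
    out = [[0.0] * n for _ in range(n)]
    k = 0
    while k < total:
        # Find the run of tied values starting at position k.
        m = k
        while m + 1 < total and flat[m + 1][0] == flat[k][0]:
            m += 1
        avg_rank = (k + m + 2) / 2  # average of the 1-based ranks k+1 .. m+1
        for _, i, j in flat[k:m + 1]:
            out[i][j] = avg_rank / total
        k = m + 1
    for i in range(n):
        out[i][i] = 1.0
    return out

# Toy symmetric adjacency matrix for three genes (hypothetical weights).
toy_adj = [[1.0, 0.2, 0.9],
           [0.2, 1.0, 0.5],
           [0.9, 0.5, 1.0]]
toy_ranked = rank_normalize(toy_adj)
```

Because only ranks are kept, networks inferred by methods with very different weight scales (correlations versus mutual information) become directly comparable.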

Network Performance Evaluation
To generate the random networks, gene IDs were shuffled randomly in the CPM- or VST-normalized expression matrices. Networks were then inferred from the randomized expression matrices by the PCC, MRNET or CLR methods and evaluated. For the PCC method, 1000 repeats of randomization and evaluation were conducted. For MRNET and CLR, each inference step took 2 hours on our server, so 10 repeats were conducted.
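The gene ID shuffling used to build the random baseline can be sketched in a few lines of Python. The function names and the seeding scheme are hypothetical; the sketch only shows the label permutation, not the inference or evaluation steps.

```python
import random

def shuffle_gene_ids(gene_ids, seed=None):
    """Return a randomly permuted copy of the gene ID list.

    Reattaching the permuted IDs to the rows of the expression matrix breaks
    the link between each expression profile and its annotation.
    """
    rng = random.Random(seed)
    shuffled = list(gene_ids)
    rng.shuffle(shuffled)
    return shuffled

def label_permutations(gene_ids, n_repeats, seed=0):
    """Yield n_repeats independent permutations of the gene IDs."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        yield shuffle_gene_ids(gene_ids, seed=rng.randrange(2**32))

# Toy example with four hypothetical gene IDs and five repeats.
toy_ids = ["g1", "g2", "g3", "g4"]
toy_perms = list(label_permutations(toy_ids, n_repeats=5))
```

In the setting described above, n_repeats would be 1000 for PCC and 10 for MRNET and CLR.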
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranked in the top 5% of their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2 (Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used for evaluation. Fourth, ChIP-Seq confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used as positive co-expressed examples for evaluation.
The widely used Area Under the Receiver Operating Characteristic curve (AUROC) for binary classification problems was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for the GO datasets were calculated with the EGAD package (Ballouz et al., 2016) in R. EGAD uses the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding a subset of genes' GO terms and testing whether the hidden GO terms can be predicted from the remaining annotations. Prediction performance was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
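For readers unfamiliar with the metric, the AUROC of a set of edge weights against binary reference labels can be computed from ranks alone. The Python sketch below uses the standard rank-sum (Mann-Whitney U) identity; it is not the ROCR or EGAD code, and the function name and toy values are hypothetical.

```python
def auroc(scores, labels):
    """AUROC from the rank-sum identity, with average ranks for tied scores.

    scores: predicted edge weights; labels: 1 for reference positives, 0 otherwise.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    k = 0
    while k < len(order):
        m = k
        while m + 1 < len(order) and scores[order[m + 1]] == scores[order[k]]:
            m += 1
        for idx in order[k:m + 1]:
            ranks[idx] = (k + m + 2) / 2  # average 1-based rank of the tie group
        k = m + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Toy example: five gene pairs, two of which are reference positives.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)
```

In the toy example, five of the six positive-negative pairs are ordered correctly, so the AUROC is 5/6; a random ordering would give a value near 0.5.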
Definition of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it does have that GO term in our GO dataset.
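The pairwise definitions above translate directly into set operations. The following Python sketch counts the four outcomes for a list of predicted gene pairs against a reference list; the function names and toy pairs are hypothetical, and gene pairs are treated as unordered.

```python
from itertools import combinations

def _as_unordered(pairs):
    """Represent each gene pair as a frozenset so (a, b) equals (b, a)."""
    return {frozenset(p) for p in pairs}

def confusion_counts(predicted_pairs, reference_pairs, all_pairs):
    """Count TP, FP, TN and FN over the universe of candidate gene pairs."""
    predicted = _as_unordered(predicted_pairs)
    reference = _as_unordered(reference_pairs)
    universe = _as_unordered(all_pairs)
    return {
        "TP": len(predicted & reference),
        "FP": len(predicted - reference),
        "FN": len((reference - predicted) & universe),
        "TN": len(universe - predicted - reference),
    }

# Toy example: four genes give six candidate pairs.
toy_genes = ["A", "B", "C", "D"]
toy_universe = list(combinations(toy_genes, 2))
toy_counts = confusion_counts(
    predicted_pairs=[("A", "B"), ("A", "C")],
    reference_pairs=[("B", "A"), ("C", "D")],
    all_pairs=toy_universe,
)
```

Sweeping the prediction threshold over all edge weights and recording these counts at each step traces out the ROC curve whose area is the AUROC.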

Network Clustering and Characterization
For each network, the top 1 million edges were selected to form a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distribution were plotted with the NetworkAnalyzer plugin (Doncheva et al., 2012). Graph clustering was performed using the Markov Cluster Algorithm (MCL) with MCL v14-137 and the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
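Selecting the highest-weighted edges from a symmetric weight matrix can be sketched as below. This Python example is illustrative only; the function name, gene IDs and weights are hypothetical, and k would be 1,000,000 for the networks described above.

```python
import heapq

def top_edges(weights, gene_ids, k):
    """Return the k highest-weighted edges from the upper triangle of a symmetric matrix.

    Each edge is a (weight, gene_a, gene_b) tuple; self-edges on the diagonal are skipped.
    """
    n = len(gene_ids)
    edges = ((weights[i][j], gene_ids[i], gene_ids[j])
             for i in range(n) for j in range(i + 1, n))
    return heapq.nlargest(k, edges)

# Toy example with three hypothetical genes.
toy_weights = [[1.0, 0.2, 0.9],
               [0.2, 1.0, 0.5],
               [0.9, 0.5, 1.0]]
toy_top = top_edges(toy_weights, ["g1", "g2", "g3"], k=2)
```

Using a heap avoids sorting every candidate edge, which matters when the full matrix holds on the order of 10^8 gene pairs.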

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a value below 0.05 was considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
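The enrichment statistics can be illustrated with a short Python sketch. It is not AgriGO's code: the function names are hypothetical, and it assumes that the "Yekutieli method" refers to the Benjamini-Yekutieli FDR procedure.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes, K of which carry the GO term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def benjamini_yekutieli(pvalues):
    """Adjusted p-values under the Benjamini-Yekutieli procedure (assumed correction)."""
    m = len(pvalues)
    c_m = sum(1 / i for i in range(1, m + 1))  # harmonic-sum penalty
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted

# Toy example: 10 background genes, 4 annotated, 3 drawn, 2 of them annotated.
toy_p = hypergeom_pvalue(k=2, n=3, K=4, N=10)
toy_adjusted = benjamini_yekutieli([0.01, 0.04, 0.03])
```

For the analysis in the text, N would be the 15116 background genes and terms would be kept only if the adjusted value is at most 0.05.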

Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011) (Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) on its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year from 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from comparisons with the GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.

Figure 3. Similarity between ten inference methods in network performance based on GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or in MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
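The logarithmic regression referred to in this legend has the form average AUROC = a + b ln(sample size). A minimal least-squares sketch in Python is given here for illustration; it is not the R lm() call used for the figure, the function name is hypothetical, and the toy values are synthetic rather than results from the study.

```python
import math

def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Synthetic toy data generated from auroc = 0.5 + 0.03 * ln(n); not study results.
toy_sizes = [12, 48, 404, 1266]
toy_aurocs = [0.5 + 0.03 * math.log(n) for n in toy_sizes]
toy_intercept, toy_slope, toy_r2 = fit_log_model(toy_sizes, toy_aurocs)
```

A positive slope with a high R-squared indicates that performance improves roughly linearly with the logarithm of the number of samples.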

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among single networks (white, "1266"), aggregated networks (grey, "agg") and protein networks (dark grey, "pr") using PCC, SCC, MRNET and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks. 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files) and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from the log2 transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the pairwise AUROC plots and the PCC values. AUROC values evaluated with the GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the pairwise AUROC plots and the PCC values. AUROC values evaluated with the PPPTY datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET). The average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2" and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distributions. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network). The number of edges linked to each gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B: 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock’s challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of “guilt-by-association” within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) “Out of Pollen” Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
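The AUROC metric used in this legend can be illustrated with a small rank-based calculation. The Python sketch below is not the authors' workflow, which was carried out in R. It assumes a hypothetical list of per-gene prediction scores and binary annotation labels, and computes AUROC through the Mann-Whitney rank-sum identity. The function name `auroc` and the toy values are illustrative assumptions.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: hypothetical prediction scores, one per gene.
    labels: 1 if the gene carries the annotation being tested, else 0.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied scores so they share an average rank.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy data (made up): two annotated genes, two unannotated genes.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of annotated from unannotated genes, which is why the legend reports it as a summary of network performance.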
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values shown on a blue-to-red color scale. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
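The scaling and distance steps named in this legend can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code: each row of made-up AUROC values (one row per GO term, one column per method) is centered and divided by its sample standard deviation, and Euclidean distances between method columns are then computed. The function names `scale_row` and `method_distances` are hypothetical, and the hierarchical clustering that would consume these distances is left out.

```python
import math
from statistics import mean, stdev


def scale_row(values):
    """Center a row on its mean and divide by its sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def method_distances(auroc_rows, methods):
    """Euclidean distance between every pair of method columns after row scaling."""
    scaled = [scale_row(row) for row in auroc_rows]
    columns = {m: [row[i] for row in scaled] for i, m in enumerate(methods)}
    distances = {}
    for i, a in enumerate(methods):
        for b in methods[i + 1:]:
            distances[(a, b)] = math.dist(columns[a], columns[b])
    return distances


# Toy AUROC values (made up): one row per GO term, one column per method.
methods = ["PCC", "SCC", "MRNET"]
auroc_rows = [
    [0.60, 0.62, 0.80],
    [0.55, 0.56, 0.70],
    [0.71, 0.70, 0.90],
]
distances = method_distances(auroc_rows, methods)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

In this toy input the two correlation methods end up closest to each other, which is the kind of grouping a distance-based clustering of methods would report.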
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
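The logarithmic model in this legend regresses average AUROC on the natural logarithm of sample size. The legend reports that the fits were obtained with lm() in R; the Python sketch below only mirrors the idea with ordinary least squares, using a hypothetical function name `fit_log_model` and made-up AUROC values. It returns the intercept, slope and R-squared, and does not compute the p-value.

```python
import math


def fit_log_model(sample_sizes, auroc_values):
    """Least-squares fit of AUROC = intercept + slope * ln(sample size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(auroc_values)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variance in AUROC explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Toy data (made up): AUROC rising with the log of the number of libraries.
sizes = [12, 100, 400, 1266]
aurocs = [0.58, 0.64, 0.68, 0.71]
intercept, slope, r_squared = fit_log_model(sizes, aurocs)
print(round(intercept, 3), round(slope, 3), round(r_squared, 3))
```

A positive slope with a high R-squared would indicate that performance grows roughly linearly in log(sample size), so each doubling of the sample count adds a similar increment.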
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as “_1”, “_2” and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Fig 7 panel contents: A and B, Venn diagrams for the MS, PA and SA networks. C, GO term enrichment network around a Hub node, with terms including chromatin assembly, chromatin organization, nucleosome assembly, DNA packaging, DNA replication, DNA repair, gene silencing and epigenetic regulation of gene expression. D, GO term enrichment network around a Cell Wall node, with terms including cell wall biogenesis, cell wall organization, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process and phenylpropanoid metabolic process.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
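The overlap counts in the Venn diagrams of Figure 7 amount to set operations on the top-ranked genes (or edges) of each network. The Python sketch below shows one way to compute the seven regions of a three-set Venn diagram. It assumes, hypothetically, that "highest interacting" means highest node degree; the gene identifiers, degrees and function names are made up for illustration.

```python
def top_n_genes(degree_by_gene, n):
    """Return the n genes with the highest degree (ties broken by gene name)."""
    ranked = sorted(degree_by_gene, key=lambda gene: (-degree_by_gene[gene], gene))
    return set(ranked[:n])


def venn3_regions(pa, sa, ms):
    """Count the genes falling in each region of a three-set Venn diagram."""
    return {
        "PA only": len(pa - sa - ms),
        "SA only": len(sa - pa - ms),
        "MS only": len(ms - pa - sa),
        "PA and SA only": len((pa & sa) - ms),
        "PA and MS only": len((pa & ms) - sa),
        "SA and MS only": len((sa & ms) - pa),
        "all three": len(pa & sa & ms),
    }


# Toy degrees (made up) for three networks; the paper uses the top 1000 genes.
pa_degree = {"g1": 9, "g2": 8, "g3": 7, "g4": 1}
sa_degree = {"g1": 9, "g3": 8, "g5": 7, "g2": 1}
ms_degree = {"g1": 5, "g5": 4, "g6": 3, "g3": 1}

pa = top_n_genes(pa_degree, 3)
sa = top_n_genes(sa_degree, 3)
ms = top_n_genes(ms_degree, 3)
regions = venn3_regions(pa, sa, ms)
print(regions)
```

The seven region counts always sum to the size of the union of the three sets, which gives a quick consistency check on any published Venn diagram.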
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5-e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545-3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707-726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572-15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143-175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670-85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575-1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924-13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200-20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244-56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860-1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1-8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866-12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29-46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Four maize datasets were used for evaluation. First, maize protein-protein interactions were downloaded from
PPIM v1.1 (Zhu et al., 2016). Only high-confidence interactions, defined as those ranking in the top 5% of
their results, were used for evaluation. Second, maize pathway information was downloaded from MaizeCyc v2.2
(Monaco et al., 2013). Genes within the same pathway were considered co-expressed. Third, maize gene ontology
(GO) data for AGPv3.30 were downloaded from AgriGO (Du et al., 2010). GO terms with 20 to 300 genes were used
for evaluation. Fourth, ChIP-Seq-confirmed targets of HDA101 (GRMZM2G172883) (Yang et al., 2016) were used
as positive co-expressed examples for evaluation.
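As an illustration of how two of the gold standards described above might be prepared, the following minimal Python sketch expands pathway membership into positive gene pairs and keeps only GO terms within the stated size range. The analyses in this study were run in R; Python is used here purely for illustration. The function names, the toy gene and term identifiers, and the assumption that the 20-to-300 bounds are inclusive are all hypothetical and are not taken from the original pipeline.

```python
from itertools import combinations


def pathway_positive_pairs(pathway_to_genes):
    """Return the set of unordered gene pairs that share at least one pathway."""
    pairs = set()
    for genes in pathway_to_genes.values():
        # Sort so that each unordered pair has one canonical representation.
        for gene_a, gene_b in combinations(sorted(set(genes)), 2):
            pairs.add((gene_a, gene_b))
    return pairs


def filter_terms_by_size(term_to_genes, min_genes=20, max_genes=300):
    """Keep terms annotated to between min_genes and max_genes genes.

    Bounds are treated as inclusive, which is an assumption of this sketch.
    """
    return {
        term: genes
        for term, genes in term_to_genes.items()
        if min_genes <= len(set(genes)) <= max_genes
    }


# Toy input with made-up identifiers (not real maize genes or GO terms).
toy_pathways = {"pwy_A": ["g1", "g2", "g3"], "pwy_B": ["g3", "g4"]}
toy_pairs = pathway_positive_pairs(toy_pathways)

toy_terms = {
    "term_small": ["g%d" % i for i in range(5)],
    "term_ok": ["g%d" % i for i in range(25)],
    "term_big": ["g%d" % i for i in range(400)],
}
kept_terms = filter_terms_by_size(toy_terms)
```

Every gene pair outside the returned set would then serve as a negative example when a network's edge scores are compared against the pathway gold standard.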
The widely used area under the receiver operating characteristic curve (AUROC) for binary classification problems was used for evaluation. Protein-protein interaction and pathway information was parsed into lists of co-expressed genes. The prediction() and performance() functions in the R package ROCR were used to calculate AUROCs (Sing et al., 2005). The 277 AUROC values for the GO dataset were calculated with the EGAD package (Ballouz et al., 2016) in R. EGAD relies on the "guilt-by-association" principle that genes with shared GO terms are more likely to be connected. Thus, networks normalized and inferred by different methods can be evaluated by hiding the GO terms of a subset of genes and testing whether the hidden GO terms can be predicted from the remaining annotations. The performance of the prediction model was measured by AUROC values in three-fold cross-validation. All ANOVA and pairwise Wilcoxon rank tests were performed in R using the anova() and pairwise.wilcox.test() functions from the stats package. The P-value adjustment method was set to "fdr" (Benjamini and Hochberg, 1995).
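
As an illustration of the AUROC statistic used above, the following is a minimal Python sketch, not the ROCR or EGAD code the authors ran in R. It assumes the rank-sum (Mann-Whitney) identity for AUROC; the function name `auroc` and the toy scores are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney rank-sum identity.

    scores: predicted edge weights (higher = more likely co-expressed)
    labels: 1 for a known positive example, 0 for a negative example
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Rank scores in ascending order; tied scores share their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the positives, normalized by the number of pos/neg pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example: three of the four positive/negative pairs are ordered correctly.
example = auroc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives.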
Definition of True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). For the evaluation using the PPPTY dataset: TP, a network predicts two genes are co-expressed and they are co-expressed in the PPPTY dataset; FP, a network predicts two genes are co-expressed but they are not co-expressed in the PPPTY dataset; TN, a network predicts two genes are not co-expressed and they are not co-expressed in the PPPTY dataset; FN, a network predicts two genes are not co-expressed but they are co-expressed in the PPPTY dataset. For the evaluation using the GO dataset: TP, a network predicts a gene has a specific GO term and it does have that GO term in our GO dataset; FP, a network predicts a gene has a specific GO term but it does not have that GO term in our GO dataset; TN, a network predicts a gene does not have a specific GO term and it does not have that GO term in our GO dataset; FN, a network predicts a gene does not have a specific GO term but it has that GO term in our GO dataset.
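
The pairwise definitions above can be made concrete with a small Python sketch. It is an illustration only: the function name `confusion_counts`, the gene names, and the edge lists are hypothetical, and enumerating every gene pair is practical only for small gene sets.

```python
from itertools import combinations


def confusion_counts(genes, predicted_edges, reference_edges):
    """Count TP, FP, TN, FN over all unordered gene pairs.

    predicted_edges: pairs the network calls co-expressed
    reference_edges: pairs treated as co-expressed in the reference dataset
    """
    # Edges are undirected, so store each pair as a frozenset.
    predicted = {frozenset(edge) for edge in predicted_edges}
    reference = {frozenset(edge) for edge in reference_edges}

    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, b in combinations(sorted(genes), 2):
        pair = frozenset((a, b))
        if pair in predicted:
            counts["TP" if pair in reference else "FP"] += 1
        else:
            counts["FN" if pair in reference else "TN"] += 1
    return counts


# Toy example with four hypothetical genes (six possible pairs).
counts = confusion_counts(
    genes=["g1", "g2", "g3", "g4"],
    predicted_edges=[("g1", "g2"), ("g1", "g3")],
    reference_edges=[("g2", "g1"), ("g3", "g4")],
)
```

Sweeping the threshold that turns edge weights into predicted edges, and recomputing these counts at each step, traces out the ROC curve whose area is the AUROC.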

Network Clustering and Characterization
For each network, the top 1 million edges were selected as a stringent co-expression network. The network topological characteristics were computed in Cytoscape (Shannon et al., 2003). The neighborhood connectivity distribution and node degree distribution were plotted with the Network Analyzer plugin (Doncheva et al., 2012). Graph clustering was performed with the Markov Cluster Algorithm (MCL) using MCL v14-137 with the inflation value set to 1.8 (Enright et al., 2002). All networks were visualized in Cytoscape.
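
The edge selection and node degree distribution described above can be sketched in Python. This is not the Cytoscape or MCL workflow the authors used; the function names and the toy weighted edge list are hypothetical, and edges are assumed to be (gene, gene, weight) tuples.

```python
import heapq
from collections import Counter


def top_edges(weighted_edges, k):
    """Return the k edges with the largest weights."""
    return heapq.nlargest(k, weighted_edges, key=lambda edge: edge[2])


def degree_distribution(edges):
    """Map each node degree to the number of nodes with that degree."""
    degree = Counter()
    for gene_a, gene_b, _weight in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return Counter(degree.values())


# Toy example: keep the 3 strongest of 4 hypothetical edges.
edges = [("a", "b", 0.9), ("a", "c", 0.8), ("b", "c", 0.2), ("c", "d", 0.7)]
stringent = top_edges(edges, 3)
distribution = degree_distribution(stringent)
```

In the paper the cutoff is the top 1 million edges per network; the same call with `k=1_000_000` would apply to a full edge list.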

Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15116 genes in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and values below 0.05 were considered significant. The Yekutieli method was used for multiple test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
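
The two statistical steps above, a hypergeometric test followed by Yekutieli correction, can be sketched in Python. This is a stand-in for AgriGO's internal calculation, not a copy of it. It assumes an upper-tail hypergeometric p-value and the Benjamini-Yekutieli step-up adjustment; the function names and toy p-values are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a query.

    N: genes in the background, K: background genes carrying the GO term,
    n: genes in the query list, k: query genes carrying the GO term.
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def benjamini_yekutieli(pvalues):
    """FDR-adjusted p-values that remain valid under dependence."""
    m = len(pvalues)
    c_m = sum(1 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[index] * m * c_m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy example: adjust three hypothetical p-values and keep terms with FDR <= 0.05.
adjusted = benjamini_yekutieli([0.01, 0.04, 0.03])
kept = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

With the background described above, `N` would be 15116; `K`, `n`, and `k` depend on the GO term and the gene list being tested.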

Databases Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011; Supplemental Table S8) were chosen as query genes to search against CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0/) through its website and against the STRING database using the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were searched against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).

Acknowledgments
We would like to give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussion on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference method.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year and generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from comparisons with the GO dataset for samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for samples normalized using VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with the PPPTY dataset for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
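
The outlier rule in this legend can be written out as a short Python sketch. It is illustrative only: the function name `boxplot_outliers` and the AUROC values are hypothetical, and the quartiles are assumed to use linear interpolation (R's default `quantile()` type), which can differ slightly from the hinges some boxplot implementations use.

```python
from statistics import quantiles


def boxplot_outliers(values):
    """Return values beyond 1.5 x IQR above the 75% or below the 25% quantile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Toy example: one hypothetical AUROC value sits far above the rest.
outliers = boxplot_outliers([0.50, 0.52, 0.55, 0.57, 0.60, 0.95])
```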

Figure 3. Similarity among ten inference methods in network performance based on GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
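
The logarithmic regression in this legend fits average AUROC as a linear function of ln(sample size). The following Python sketch shows that fit by ordinary least squares. It is a stand-in for R's lm(), it omits the p-value, and the function name `fit_log_model` and the example data are hypothetical.

```python
from math import log


def fit_log_model(sample_sizes, aurocs):
    """Fit auroc = intercept + slope * ln(sample_size); return R-squared too."""
    x = [log(n) for n in sample_sizes]
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(aurocs) / count

    # Ordinary least squares for a single predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R-squared: share of the variance in AUROC explained by the fit.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    return intercept, slope, 1 - ss_res / ss_tot


# Toy example: synthetic AUROC values that follow 0.5 + 0.02 * ln(n) exactly.
sizes = [12, 24, 48, 96, 404, 1266]
intercept, slope, r_squared = fit_log_model(sizes, [0.5 + 0.02 * log(n) for n in sizes])
```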

Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and protein networks (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and protein networks (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
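
For reference, the CPM and RPKM normalizations named in this legend can be sketched in Python using their standard definitions. This is not the authors' R pipeline: the function names, gene identifiers, counts, and lengths are hypothetical. The library size is taken here as the sum of the supplied counts, and the pseudocount in the log2 step is an assumption, not a value from the paper.

```python
from math import log2


def cpm(counts):
    """Counts Per Million: scale each gene count by library size."""
    library_size = sum(counts.values())
    return {gene: count / library_size * 1e6 for gene, count in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: CPM further divided by gene length in kb."""
    library_size = sum(counts.values())
    return {
        gene: count * 1e9 / (library_size * lengths_bp[gene])
        for gene, count in counts.items()
    }


def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount (assumed) avoids log2(0)."""
    return {gene: log2(value + pseudocount) for gene, value in values.items()}


# Toy library: gene g2 is three times longer and gets three times the reads.
counts = {"g1": 500, "g2": 1500}
lengths = {"g1": 1000, "g2": 3000}
cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths)
```

In this toy case CPM differs threefold between the genes while RPKM is equal, which is the gene-length effect that panel C examines.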

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stage. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the AUROC values and the PCC values. AUROC values evaluated with the GO dataset are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal blocks, between the AUROC values and the PCC values. AUROC values evaluated with the PPPTY dataset are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed using the Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distributions. The R² value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to each gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.
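
A power-law fit of a node degree distribution, count = a × degree^b, can be sketched in Python as a straight-line fit on log-transformed values. This assumes a least-squares fit in log-log space, which may not match how the Cytoscape plugin computes its curve; the function name `fit_power_law` and the toy distribution are hypothetical.

```python
from math import exp, log


def fit_power_law(degree_counts):
    """Fit count = a * degree**b by least squares on log-log values.

    degree_counts: mapping of node degree -> number of nodes with that degree.
    Returns (a, b, r_squared), with R-squared computed in log-log space.
    """
    points = [(log(k), log(c)) for k, c in degree_counts.items() if k > 0 and c > 0]
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count

    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx                  # exponent of the power law
    ln_a = mean_y - b * mean_x     # intercept in log space

    ss_res = sum((y - (ln_a + b * x)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return exp(ln_a), b, 1 - ss_res / ss_tot


# Toy example: a synthetic distribution that follows 1000 * degree**-2 exactly.
a, b, r_squared = fit_power_law({1: 1000.0, 2: 250.0, 4: 62.5, 8: 15.625})
```

A negative exponent with R² near 1 is the pattern usually read as scale-free topology.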

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra JL, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Sekhon RS, Vaillancourt B, Hirsch CN, Buell CR, de Leon N, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. (A) A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). (B) The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. (A) Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. (B) Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. (C) Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. (D) Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). (E) Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. (F) Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
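The outlier rule stated in the Figure 2 legend can be sketched as follows. This is a minimal illustration, not the plotting code used for the figure: the function name `tukey_outliers`, the quantile convention (`method="inclusive"`), and the AUROC values are assumptions made for the example.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return values lying more than k * IQR above Q3 or below Q1."""
    # Quartiles by linear interpolation; plotting packages may use a
    # slightly different quartile convention (assumption of this sketch).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical AUROC values for one box in the plot (not data from the study).
auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.90]
print(tukey_outliers(auroc))
```

Only the last value falls outside the fences here; the remaining values would be drawn inside the box and whiskers.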
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values colored from blue (lower limit) to red (upper limit). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
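The scaling and distance steps named in the Figure 3 legend can be sketched as below. This assumes that each GO term (row) is scaled across methods and that methods are then compared by the Euclidean distance between their scaled profiles; the legend does not spell out these details, and the function names and AUROC values are hypothetical.

```python
import math
import statistics

def scale_row(values):
    """Scale one row to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def method_profiles(auroc_by_term, methods):
    """Return, for each method, its vector of row-scaled AUROC values."""
    scaled_rows = [scale_row(row) for row in auroc_by_term.values()]
    return {m: [row[i] for row in scaled_rows] for i, m in enumerate(methods)}

# Hypothetical AUROC values: one row per GO term, one column per method.
methods = ["PCC", "SCC", "MRNET"]
auroc_by_term = {
    "term_1": [0.70, 0.72, 0.60],
    "term_2": [0.80, 0.78, 0.65],
    "term_3": [0.55, 0.56, 0.70],
}

profiles = method_profiles(auroc_by_term, methods)
# Smaller Euclidean distance means more similar performance profiles.
print(math.dist(profiles["PCC"], profiles["SCC"]))
print(math.dist(profiles["PCC"], profiles["MRNET"]))
```

A hierarchical clustering routine would take these pairwise distances as input; in this toy data the two correlation methods sit much closer to each other than either does to MRNET.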
Figure 4. Effect of sample size on network performance. (A) Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). (B) Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Fig 4 panel titles: (A) GO PCC, GO SCC, GO MRNET, GO CLR; (B) PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR
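The logarithmic model described in the Figure 4 legend has the form average AUROC = intercept + slope × ln(sample size). The sketch below fits that form by ordinary least squares and reports R-squared. It is an illustration in plain Python rather than the authors' R code, it omits the p-value that lm() also reports, and the function name and data points are hypothetical.

```python
import math

def fit_log_model(sample_sizes, auroc):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(auroc) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, auroc))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in AUROC explained by the fit.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, auroc))
    ss_tot = sum((yi - mean_y) ** 2 for yi in auroc)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical network sizes and average AUROC values (not data from the study).
sizes = [12, 50, 200, 800, 1266]
avg_auroc = [0.60, 0.63, 0.655, 0.68, 0.695]
intercept, slope, r_squared = fit_log_model(sizes, avg_auroc)
print(f"intercept={intercept:.3f} slope={slope:.3f} R2={r_squared:.3f}")
```

A positive slope with a high R-squared would correspond to performance that keeps improving with sample size, but with diminishing returns on the original scale.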
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. (A) Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. (B) Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. (A) Area under the ROC curve (AUROC) values from GO evaluation of the single networks (white bars), aggregation networks (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. (B) AUROC values from PPPTY evaluation of the single networks (white bars), aggregation networks (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Fig 6 labels: AUC; Protein GO (A); Protein PPPTY (B)
Fig 7 labels recovered from the figure panels:
(A) Venn diagram of MS, PA, and SA; region counts: 835, 45, 38, 12, 5, 802, 148
(B) Venn diagram of MS, PA, and SA; region counts: 872505, 167664, 165104, 11732, 9172, 716573, 106591
(C) Hub; GO terms shown: chromatin assembly/disassembly; cellular macromolecule metabolic process; chromatin assembly; N2 compound metabolic process; gene silencing; macromolecule metabolic process; cellular component organization; chromatin modification; biosynthetic process; cellular biosynthetic process; DNA packaging; organelle organization; protein-DNA complex assembly; nucleosome organization; DNA-dep DNA replication; macromolecule biosynthetic process; response to DNA damage stimulus; chromosome organization; pattern specification process; DNA replication; DNA conformation change; translation; cellular macromolecule biosynthetic process; Nucleic acid metabolic process; gene expression; chromatin organization; nucleosome assembly; epigenetic reg of gene expression; negative regulation of macromolecule metabolic process; cellular response to stress; RNA processing; DNA repair; regionalization
(D) Cell Wall; GO terms shown: polysaccharide biosynthetic process; cell wall organization or biogenesis; glucan metabolic process; cellular glucan metabolic process; cellular polysaccharide biosynthetic process; cellular carbohydrate biosynthetic process; cellulose metabolic process; cellular polysaccharide metabolic process; cellulose biosynthetic process; epidermis development; cell growth; growth; regulation of cellular component size; cellular amino acid derivative metabolic process; cell wall polysaccharide metabolic process; carbohydrate metabolic process; regulation of anatomical structure size; GTP metabolic process; root morphogenesis; epidermal cell differentiation; ectoderm development; phenylpropanoid biosynthetic process; regulation of cell size; glucan biosynthetic process; carbohydrate biosynthetic process; cellular cell wall organization or biogenesis; cell wall biogenesis; root epidermal cell differentiation; cell differentiation; cell wall organization; protein polymerization; plant-type cell wall biogenesis; cellular carbohydrate metabolic process; phenylpropanoid metabolic process; cell wall macromolecule metabolic process; cellular cell wall macromolecule metabolic process; plant-type cell wall organization or biogenesis; hemicellulose metabolic process
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. (A) Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. (B) Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. (C) GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. (D) GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. (A) Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. (B) Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. (C) Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.
Parsed Citations
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Front Genet 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Gene Ontology Enrichment and Visualization
Gene ontology enrichment was analyzed with AgriGO's Singular Enrichment Analysis tool (Du et al., 2010). The 15,116 genes involved in our networks were used as the background reference. Hypergeometric testing was used to calculate p-values, and a p-value below 0.05 was considered significant. The Yekutieli method was used for multiple-test correction, and terms with a false discovery rate (FDR) above 0.05 were discarded. The results were then imported into Cytoscape for visualization.
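The test and correction described above can be sketched in a few lines of Python. This is only an illustration and not AgriGO's implementation: the function names, term labels, and toy counts are hypothetical, and the sketch assumes that the Yekutieli option corresponds to the Benjamini-Yekutieli step-up procedure.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N background genes,
    K of which are annotated with the GO term."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic factor for dependent tests
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


BACKGROUND = 15116   # background gene count stated in the text
MODULE_SIZE = 200    # hypothetical module size
# Hypothetical terms: (genes with the term in the module, genes with the term in the background)
toy_terms = {"term_a": (12, 150), "term_b": (3, 400), "term_c": (1, 30)}

pvals = [hypergeom_upper_tail(k, K, MODULE_SIZE, BACKGROUND) for k, K in toy_terms.values()]
fdr = benjamini_yekutieli(pvals)
# Keep terms with p < 0.05 and FDR not above 0.05, mirroring the cutoffs in the text.
kept = [t for t, p, q in zip(toy_terms, pvals, fdr) if p < 0.05 and q <= 0.05]
print(kept)
```

Exact integer binomials are used so that the tail probability stays accurate for a background of this size; a statistics library would normally replace both helper functions.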

Database Comparison on Cell Wall Pathway
Sixteen well-characterized components of cell wall biosynthesis (Penning et al., 2009; Bosch et al., 2011; Supplemental Table S8) were chosen as query genes to search CORNET Maize (https://bioinformatics.psb.ugent.be/cornet/versions/cornet_maize1.0) through its website and the STRING database through the Cytoscape stringApp (http://apps.cytoscape.org/apps/stringapp). The parameters for searching the CORNET database were Method = Pearson, Correlation coefficient = 0.75, P-value ≤ 0.05, and Top genes = 50. This resulted in 210 co-expressed genes and 325 interactions. To search the STRING database, the confidence cutoff was set to 0.4 with the maximum number of interactors set to 100; 76 genes with 817 interactions were retrieved. Maize proteins were aligned against TAIR 10 protein sequences using standalone BLASTP version 2.2.28+ (Camacho et al., 2009).
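To make the CORNET-style query parameters concrete, the following Python sketch selects co-expression partners of a query gene by a Pearson correlation cutoff and a top-N limit. It is not CORNET's code: the function names and the toy expression profiles are hypothetical, the P-value filter is omitted, and keeping only positive correlations is an assumption of this sketch.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation; treat as 0
    return sxy / sqrt(sxx * syy)


def top_coexpressed(query, expression, min_r=0.75, top_n=50):
    """Return up to top_n (gene, r) pairs with r >= min_r, strongest first."""
    scores = [
        (gene, pearson(expression[query], profile))
        for gene, profile in expression.items()
        if gene != query
    ]
    passing = [(gene, r) for gene, r in scores if r >= min_r]
    passing.sort(key=lambda pair: pair[1], reverse=True)
    return passing[:top_n]


# Hypothetical expression profiles across four samples.
toy_expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],   # perfectly correlated with A
    "C": [4, 3, 2, 1],   # perfectly anti-correlated with A
    "D": [1, 3, 2, 4],   # r = 0.8 with A
}
partners = top_coexpressed("A", toy_expression)
print(partners)
```

Running one such query per cell wall gene and taking the union of the returned pairs would give a gene list and an interaction list comparable in form to the counts reported above.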

Acknowledgments
We give special thanks to Dr. Peixiang Zhao (FSU Department of Computer Science) for advice and discussion on the topological analysis of maize networks. We also thank Dr. Alan Lemmon (FSU Department of Scientific Computing) and Dr. Jonathan Dennis (FSU Department of Biological Science) for helpful discussions on data analysis.

Supplemental Data
Supplemental Figure 1. Pipeline and datasets used for analysis.
Supplemental Figure 2. Distribution of gene expression values.
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages.
Supplemental Figure 4. Pairwise comparison among results of inference methods.
Supplemental Figure 5. Characteristics of the full set of 1,720 PPPTY genes (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing co-expression data in Cytoscape.

Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year from 2008 to 2016.
638
Figure 2 Normalization and network inference methods effect on single network performance A Network 639
performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) 640
values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation 641
(VST) Counts Per Million (CPM) or Reads per Kilobase per Million (RPKM) methods B Network performance 642
was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using 643
VST CPM or RPKM methods C Network performance was evaluated by calculating AUROC values from 644
comparisons with HDA101 binding targets for samples normalized using VST CPM or RPKM methods D 645
Network performance was evaluated by calculating AUROC values from comparisons with GO dataset for 646
samples constructed using ten inference methods including Pearson Correlation Coefficient (PCC) Spearman 647
correlation coefficient (SCC) Kendall rank Correlation Coefficient (KCC) Gini correlation coefficient (GCC) 648
Biweight midcorrelation (BIC) Cosine Similarity Coefficient (CSC) Additive ARCNE (AA) multiplicative 649
ARCNE (MA) Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR) E 650
Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for samples 651
constructed using ten inference methods F Network performance was evaluated by calculating AUROC 652
values from comparisons with HDA101 binding targets for samples constructed using ten inference methods 653
Outliers were defined as outside of 15 times the interquartile range above 75 quantile or below 25 quantile 654
Median values were plotted as bold horizontal lines For C to F the horizontal dashed lines indicate the highest 655
and lowest AUROC values 656
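The AUROC statistic used throughout these legends can be illustrated with a minimal sketch. This is a generic pairwise formulation, not the authors' evaluation pipeline; the function name `auroc` and the toy scores are hypothetical. It treats AUROC as the probability that a randomly chosen positive (for example, a gene annotated to a GO term) receives a higher network score than a randomly chosen negative.

```python
def auroc(scores, labels):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four genes, two of which carry the annotation.
example = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic loop is kept for readability; a rank-based formula gives the same value on large gene sets.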
Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
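The scaling step described in the Figure 3 legend can be sketched as a per-row z-score. This is an illustration under stated assumptions: the function name `scale_row` is hypothetical, the sample standard deviation (n - 1) is assumed, and clipping to the range -3 to 3 is an assumption about how the color scale was bounded, since the legend does not say whether values were clipped.

```python
import statistics


def scale_row(values, limit=3.0):
    """Z-score one row of AUROC values (one GO term or gene across methods)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption
    if sd == 0:
        return [0.0 for _ in values]  # constant row carries no contrast
    # Clip to [-limit, limit] so extreme values saturate the color scale.
    return [max(-limit, min(limit, (v - mean) / sd)) for v in values]


# Hypothetical AUROC values for one GO term under three inference methods.
scaled = scale_row([0.5, 0.6, 0.7])
```

After scaling, each row has mean 0, so the heat map compares methods relative to each other for a given term rather than comparing absolute AUROC values.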

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
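The logarithmic model in the Figure 4 legend has the form average AUROC = a + b * ln(sample size). The legend states that the fit was done with lm() in R; the sketch below is a Python equivalent of the same ordinary least squares fit, with a hypothetical function name and made-up data points. It returns R-squared but does not compute the p-value.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least squares fit of auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R-squared: share of variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Made-up points lying exactly on auroc = 0.5 + 0.02 * ln(n).
sizes = [12, 100, 1266]
values = [0.5 + 0.02 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
```

A positive slope on the log scale means each doubling of sample size adds a roughly constant increment to average AUROC, which is the diminishing-returns pattern a logarithmic model encodes.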

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; stars indicate the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for the 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) log2 transformation. VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation of the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
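The CPM and RPKM normalizations named in this legend follow standard definitions, sketched below for a single library. This is not the authors' code: the function names and the toy counts are hypothetical, library size is taken as the sum of the gene counts shown, and the pseudocount of 1 before the log2 transform is an assumption, since the legend does not state how zeros were handled.

```python
import math


def cpm(counts):
    """Counts Per Million: scale each gene count by library size."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: CPM further divided by gene length in kb."""
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount is an assumption."""
    return [math.log2(v + pseudocount) for v in values]


# Toy library with three genes (hypothetical counts and lengths).
counts = [100, 300, 600]
lengths = [1000, 2000, 500]
cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths)
```

RPKM differs from CPM only by the length term, which is why panel C plots expression against gene length: the comparison shows whether length dependence remains after each normalization.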

Supplemental Figure 3. CPM-normalized, log2-transformed maize gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by GO datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method is plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by PPPTY datasets are plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full PPPTY gene set of 1720 genes (ALL_1720) and of genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks in each category. Mean values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed using the Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median. Asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the fitted power-law distributions. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node degree distribution for four networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection of the three networks (Merged network). The number of edges linked to each gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the fitted power-law distribution, with the function and R2 indicated beside it.
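A power-law fit to a node degree distribution, as described in this legend, can be sketched as a straight-line fit on log-log axes. The legend does not say which tool or fitting procedure produced the red curves, so the least squares approach below is an assumption, and the function names and toy data are hypothetical. The model is count = a * degree^b, and R2 is computed on the log-transformed values.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each node degree to the number of nodes that have that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


def fit_power_law(dist):
    """Fit count = a * degree**b by least squares on log-log values."""
    xs = [math.log(k) for k in dist]
    ys = [math.log(c) for c in dist.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct degrees")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    ss_res = sum((y - (math.log(a) + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


# Toy star graph: gene "a" linked to three others.
toy = degree_distribution([("a", "b"), ("a", "c"), ("a", "d")])
# Made-up distribution lying exactly on count = 1000 * degree**-2.
a, b, r2 = fit_power_law({1: 1000, 2: 250, 5: 40, 10: 10})
```

A negative exponent b with R2 near 1 is the signature usually read as scale-free topology: most genes have few edges and a small number of hubs have many.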
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol. doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One. doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative 1026 Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194 1027
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S 1028 Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and 1029 caveats Plant Cell Environ 32 1633ndash51 1030
USDA (2016) Grain World Markets and Trade 1031
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant 1032 Physiol 153 895ndash905 1033
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism 1034 and Beyond Nat Rev Mol Cell Biol 16 258ndash264 1035
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR 1036 (2016) Integration of omic networks in a developmental atlas of maize Science 353 814ndash818 1037
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene 1038 ranking BMC Bioinformatics 16 S6 1039
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring genendashgene interactions and functional 1040 modules using sparse canonical correlation analysis Ann Appl Stat 9 300ndash323 1041
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 1042 57ndash63 1043
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray 1044 experiments BMC Genomics 5 87 1045
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability ofldquo guilt-by-associationrdquo 1046 within gene coexpression networks BMC Bioinformatics 6 227 1047
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) ldquoOut of Pollenrdquo Hypothesis for Origin of New 1048 Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822ndash2829 1049
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of 1050 Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during 1051 Seed Development Plant Cell 28 629ndash645 1052
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 1053 13 83 1054
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC 1055 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Page | 30
Bioinformatics 12 290 1056
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene 1057 network reconstruction PLoS One 9 e106319 1058
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation 1059 Plant Cell Physiol 56 195ndash214 1060
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein 1061 interaction database for Maize Plant Physiol 170 pp1501821- 1062
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) 1063 The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015 1064
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) method. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM method. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM method. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
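As an illustration of the AUROC statistic used throughout these evaluations, the following minimal Python sketch computes AUROC as the probability that a randomly chosen positive item is scored above a randomly chosen negative one. It is not the authors' evaluation pipeline; the function name `auroc`, the example scores, and the labelling scheme (1 for a gene annotated to a given term, 0 otherwise) are assumptions made for illustration.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: network-derived scores for six candidate genes,
# labelled 1 if the gene carries the annotation of interest and 0 otherwise.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(auroc(scores, labels))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation; the pairwise loop is quadratic and is meant only to make the definition explicit.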
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between [illegible] (blue) and [illegible] (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
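The scaling and distance steps named in this caption can be sketched in Python as follows. This is an assumption-laden illustration, not the authors' code: it assumes that each row (one GO term or gene) is scaled across methods using the sample standard deviation, and the AUROC values, the three-method subset, and the helper names `scale_row` and `euclidean` are all hypothetical.

```python
import math
from statistics import mean, stdev


def scale_row(values):
    """Centre a row to mean 0 and scale it to sample standard deviation 1."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical AUROC values: one row per GO term, one column per method.
methods = ["PCC", "SCC", "CLR"]
auroc_rows = [
    [0.70, 0.71, 0.80],
    [0.60, 0.62, 0.75],
    [0.65, 0.64, 0.55],
]

scaled_rows = [scale_row(row) for row in auroc_rows]

# One profile per method: that method's scaled AUROC across all rows.
profiles = {m: [row[i] for row in scaled_rows] for i, m in enumerate(methods)}

# Pairwise distances between method profiles; a hierarchical clustering
# would be built from a distance table like this one.
distances = {
    (a, b): euclidean(profiles[a], profiles[b])
    for i, a in enumerate(methods)
    for b in methods[i + 1:]
}
for pair, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(pair, round(d, 3))
```

Methods whose scaled AUROC profiles lie close together would be grouped first by a clustering procedure operating on these distances.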
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 networks of different sizes, plotted against natural logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 networks of different sizes, plotted against natural logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
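The logarithmic model in this caption has the form average AUROC = a + b × ln(sample size). The caption states that the fit was done with lm() in R; the sketch below is a plain Python stand-in for the same ordinary least-squares fit, without p-values. The function name `fit_log_model` and the AUROC values are hypothetical; only the sample sizes 12 and 1266 appear in the manuscript.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (slope, intercept, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical average AUROC values, not taken from the paper.
sizes = [12, 50, 200, 800, 1266]
avg_auroc = [0.60, 0.64, 0.67, 0.70, 0.71]
slope, intercept, r2 = fit_log_model(sizes, avg_auroc)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r2:.3f}")
```

A positive slope with a high R-squared indicates that performance rises roughly linearly in log(sample size), that is, with diminishing returns per added library.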
Figure 5. Evaluation of networks of different sizes constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). In both panels, bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
[Figure 7 image. A and B: three-set Venn diagrams for the MS, PA, and SA networks. C: GO term enrichment graph labelled "Hub", with terms including chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, gene silencing, and epigenetic regulation of gene expression. D: GO term enrichment graph labelled "Cell Wall", with terms including cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, phenylpropanoid biosynthetic process, and root epidermal cell differentiation.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
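The overlap counts shown in a three-set Venn diagram such as those in Figure 7 can be derived with basic set operations. The sketch below is illustrative only: the helper names `top_k` and `venn3_counts`, the gene identifiers, and the scores are hypothetical, and the ranking criterion (one numeric score per gene) is an assumption rather than the authors' exact definition of a highly interacting gene.

```python
def top_k(scores, k):
    """Return the set of k keys with the largest scores."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def venn3_counts(a, b, c):
    """Sizes of the seven disjoint regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


# Hypothetical per-gene connectivity scores in three networks.
pa = top_k({"g1": 9, "g2": 8, "g3": 7, "g4": 1}, 3)
sa = top_k({"g1": 9, "g2": 2, "g3": 7, "g5": 8}, 3)
ms = top_k({"g1": 5, "g6": 9, "g3": 1, "g5": 8}, 3)

counts = venn3_counts(pa, sa, ms)
print(counts)
```

The seven region sizes always sum to the size of the union of the three sets, which gives a quick consistency check on a drawn diagram.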
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with a reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Node and edge colors are as described for A. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Node and edge colors are as described for A.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY set (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET).
Supplemental Figure 6. Evaluation of network performance based on sample size and inference.
Supplemental Figure 7. GCN performance comparison between protein networks.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS).
Supplemental Figure 9. Node distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS), and the intersection among the three networks (Merged network).
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA).
Supplemental Table S1. RNA-Seq libraries used in this analysis.
Supplemental Table S2. Random network AUROC value baseline.
Supplemental Table S3. ANOVA tables and pairwise comparisons.
Supplemental Table S4. Topological characteristics of four maize networks.
Supplemental Table S5. Gene Ontology annotation for 148 hub genes.
Supplemental Table S6. Enriched GO terms for PCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S7. Enriched GO terms for SCC ranked aggregation networks from module 1 to module 8.
Supplemental Table S8. 16 query genes in the maize cell wall pathway.
Supplemental Table S9. GO enrichment analysis for 214 co-expressed genes of cell wall query genes in the merged network.
Supplemental Table S10. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the merged network.
Supplemental Table S11. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the CORNET database.
Supplemental Table S12. Annotation for co-expressed genes queried by 16 cell wall pathway genes from the STRING database.
Supplemental Dataset S1. The merged network in Cytoscape-ready format.
Supplemental Dataset S2. Tutorial: Visualizing Co-expression data in Cytoscape.
Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.

Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods, including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
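
The AUROC statistic used throughout these legends can be illustrated with a minimal sketch. The function below is an assumption for illustration only: it computes AUROC from ranks (the Mann-Whitney formulation) in plain Python and is not the evaluation code used in the study, which the reference list suggests relied on R tooling such as ROCR. The names `auroc`, `scores` and `labels` are hypothetical; `scores` stands for co-expression strengths and `labels` marks gene pairs that share an annotation.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: co-expression strengths (higher = more confident edge).
    labels: truthy for known positives (e.g. shared GO term), falsy otherwise.
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current run of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)
```

A value of 0.5 corresponds to random ranking and 1.0 to a ranking in which every positive pair scores above every negative pair, which is why the legends compare networks against a random baseline.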

Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
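
The scaling and distance steps described for Figure 3 can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes that each row (a GO term or gene) is standardized across methods using the sample standard deviation, as R's `scale()` would do, and that methods are then compared by Euclidean distance between their scaled profiles. The AUROC numbers and term names are made up for the example.

```python
import math
from statistics import mean, stdev


def zscore(row):
    """Scale one row to mean 0 and sample standard deviation 1."""
    mu = mean(row)
    sd = stdev(row)
    if sd == 0:
        return [0.0 for _ in row]  # constant rows carry no contrast
    return [(value - mu) / sd for value in row]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical AUROC values: rows are GO terms, columns are methods.
methods = ["PCC", "SCC", "MRNET"]
auroc_by_term = {
    "term_1": [0.80, 0.78, 0.60],
    "term_2": [0.70, 0.72, 0.55],
    "term_3": [0.65, 0.66, 0.75],
}

scaled_rows = [zscore(row) for row in auroc_by_term.values()]
# One profile per method: its scaled AUROC across all terms.
profiles = {m: [row[i] for row in scaled_rows] for i, m in enumerate(methods)}

d_pcc_scc = euclidean(profiles["PCC"], profiles["SCC"])
d_pcc_mrnet = euclidean(profiles["PCC"], profiles["MRNET"])
```

In this toy input the two correlation methods end up closer to each other than either is to MRNET, which is the kind of grouping a hierarchical clustering on these distances would then display.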

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated by the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
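
The logarithmic model in Figure 4 amounts to an ordinary least-squares fit of average AUROC against the natural logarithm of sample size. The sketch below reproduces the coefficient and R-squared part of that fit in plain Python; it is an assumption-laden stand-in for R's `lm()`, it does not compute p-values, and the function name `fit_log_model` and the example numbers are hypothetical.

```python
import math


def fit_log_model(sample_sizes, avg_aurocs):
    """Fit avg_auroc = intercept + slope * ln(sample_size) by least squares.

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(avg_aurocs)
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count

    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical example: AUROC grows by 0.02 per unit of ln(sample size).
sizes = [12, 48, 404, 1266]
aurocs = [0.5 + 0.02 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, aurocs)
```

A positive slope with a high R-squared would indicate that performance keeps improving with sample size, but with diminishing returns on the original scale.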

Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled as S12_1 to S404; the S1266 network included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among single network (white, "1266"), aggregated network (grey, "agg") and protein network (dark grey, "pr") using PCC, SCC, MRNET and CLR. A, GO evaluation on networks. Inference methods were indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation on networks. Inference methods were indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median. Asterisks indicate the mean and grey dots indicate outliers.

Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram shows the overlap among the 1000 highest interacting genes in three networks. 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram shows the overlap among the top 1 x 10^6 edges in three networks. 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
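
The edge overlap shown in the Venn diagram of Figure 7B can be sketched as a set intersection over the strongest edges of each network. The code below is a minimal illustration under stated assumptions: edges are treated as undirected, each network is a list of hypothetical `(gene_a, gene_b, weight)` tuples, and ties at the cutoff are broken by sort order. It is not the procedure used to produce the published counts.

```python
def top_edges(weighted_edges, k):
    """Return the k highest-weight edges as a set of unordered gene pairs."""
    ranked = sorted(weighted_edges, key=lambda edge: edge[2], reverse=True)
    # frozenset makes (g1, g2) and (g2, g1) compare equal.
    return {frozenset((gene_a, gene_b)) for gene_a, gene_b, _ in ranked[:k]}


def shared_edges(*edge_sets):
    """Edges present in every one of the given edge sets."""
    return set.intersection(*edge_sets)


# Hypothetical toy networks standing in for PA, SA and MS.
pa = [("g1", "g2", 0.90), ("g2", "g3", 0.80), ("g1", "g3", 0.10)]
sa = [("g2", "g1", 0.95), ("g3", "g4", 0.70), ("g2", "g3", 0.20)]
ms = [("g1", "g2", 0.50), ("g2", "g4", 0.40), ("g3", "g4", 0.30)]

merged = shared_edges(top_edges(pa, 2), top_edges(sa, 2), top_edges(ms, 2))
```

Here only the g1-g2 edge survives in all three toy networks, mirroring how the merged network keeps edges supported by PA, SA and MS together.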

Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions. C, Network retrieved from the STRING database queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways. Grey lines indicate network predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files) and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported to the R environment for the normalization, inference and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm normalization (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean value and standard deviation from log2 transformed expressions. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
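
For readers unfamiliar with the two count-based normalizations named in this legend, the sketch below shows the standard definitions of CPM and RPKM for a single library, followed by a log2 transform. It is an illustration only: the study performed these steps in R, the function names are hypothetical, and the pseudocount of 1 added before the logarithm is an assumption made here to avoid log2(0), not a value taken from the article. VST is not shown because it depends on a fitted dispersion model.

```python
import math


def cpm(counts):
    """Counts Per Million: scale each gene's count by library size."""
    library_size = sum(counts)
    return [c * 1e6 / library_size for c in counts]


def rpkm(counts, gene_lengths_bp):
    """Reads Per Kilobase per Million mapped reads.

    Divides by library size in millions and gene length in kilobases,
    which removes the length dependence that CPM retains.
    """
    library_size = sum(counts)
    return [
        c * 1e9 / (library_size * length_bp)
        for c, length_bp in zip(counts, gene_lengths_bp)
    ]


def log2_transform(values, pseudocount=1.0):
    """log2(value + pseudocount); the pseudocount is an assumed choice."""
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical library with three genes.
raw_counts = [10, 30, 60]
lengths_bp = [1000, 2000, 500]
cpm_values = cpm(raw_counts)
rpkm_values = rpkm(raw_counts, lengths_bp)
log_cpm = log2_transform(cpm_values)
```

Comparing `cpm_values` with `rpkm_values` for the 500 bp gene shows why panel C plots expression against gene length: only RPKM corrects for it.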

Supplemental Figure 3. Maize CPM-normalized, log2 transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stage. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values calculated by Pearson correlation shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks, between the AUROC values and the PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values calculated by Pearson correlation shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC) and MRNET method (MRNET). The average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

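The outlier rule stated in this legend (more than 1.5 times the interquartile range beyond the 25% or 75% quantile) can be sketched as below. The AUROC values are hypothetical, and the quartile convention is an assumption: statistics.quantiles with the inclusive method interpolates linearly between data points, whereas the boxplot routine used for the figure may compute its hinges slightly differently.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical AUROC values for one network
auroc_values = [0.50, 0.52, 0.55, 0.56, 0.58, 0.60, 0.95]
flagged = boxplot_outliers(auroc_values)
```

With these numbers the quartiles are 0.535 and 0.59, so the fences sit at 0.4525 and 0.6725 and only the value 0.95 is flagged.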
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA) and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS) and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics, pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
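As a reminder of what an AUROC value measures in these evaluations, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive gene scores higher than a randomly chosen negative gene, with ties counted as one half. This is a generic illustration with hypothetical scores and labels, not the evaluation procedure used for the figure.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical network-derived scores and known annotations (1 = gene has the annotation)
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
example_auroc = auroc(scores, labels)
```

A value of 0.5 corresponds to a random ranking and 1.0 to a ranking that places every annotated gene above every unannotated one; the toy example gives 5/6.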
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to the standard normal distribution, resulting in scaled AUROC values shown on a color scale from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
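The two steps named in this legend, scaling AUROC values to a standard normal distribution and comparing methods by Euclidean distance, can be sketched as follows. The AUROC values are hypothetical, and the choice to scale each GO term (row) across methods is an assumption about how the scaling was applied; the hierarchical clustering that would follow from the distances is not shown.

```python
import math
import statistics

def zscore(values):
    """Scale values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical AUROC values: one row per GO term, one column per method
methods = ["PCC", "SCC", "MRNET"]
auroc_rows = [
    [0.70, 0.72, 0.55],
    [0.60, 0.62, 0.75],
    [0.80, 0.79, 0.65],
]

# Scale each GO term across methods, then collect one profile per method
scaled_rows = [zscore(row) for row in auroc_rows]
profiles = {m: [row[i] for row in scaled_rows] for i, m in enumerate(methods)}

dist_pcc_scc = euclidean(profiles["PCC"], profiles["SCC"])
dist_pcc_mrnet = euclidean(profiles["PCC"], profiles["MRNET"])
```

In this toy matrix the two correlation methods have similar scaled profiles and end up much closer to each other than either is to MRNET, which is the kind of grouping a distance-based clustering would pick up.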
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-square and p-values were calculated by the lm() function in R. Panels in A: GO PCC, GO SCC, GO MRNET, GO CLR. Panels in B: PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
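The logarithmic regression described for Figure 4 amounts to an ordinary least-squares fit of average AUROC against the natural logarithm of sample size. The sketch below shows that calculation and the resulting R-square; the sample sizes and AUROC values are hypothetical, the function name fit_log_model is introduced only for illustration, and the p-value that lm() also reports is not computed here.

```python
import math

def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample size); returns R-square too."""
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(aurocs) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Hypothetical sample sizes between 12 and 1266, with AUROC generated from a known model
sizes = [12, 50, 200, 800, 1266]
values = [0.55 + 0.03 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
```

Because the toy values are generated from the model itself, the fit recovers the intercept 0.55 and slope 0.03 with an R-square of essentially 1; real AUROC averages would scatter around the fitted line.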
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). In both panels, bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
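The box plot elements named in this legend (median, mean, outliers) can be computed as in the sketch below. The legend does not state how outliers were defined; the sketch assumes the common Tukey convention of points more than 1.5 times the interquartile range beyond the quartiles. The AUROC values are invented for illustration.

```python
from statistics import mean, median, quantiles

def boxplot_summary(values, whisker=1.5):
    """Median, mean, quartiles, and outliers under the assumed 1.5 * IQR rule."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    return {
        "median": median(data),
        "mean": mean(data),
        "q1": q1,
        "q3": q3,
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical per-term AUROC values for one network.
example_auroc = [0.60, 0.62, 0.65, 0.66, 0.68, 0.70, 0.95]
summary = boxplot_summary(example_auroc)
```

In this toy example the single high value pulls the mean above the median, which is the situation where plotting both markers is informative.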
[Figure 7 graphic. A and B: Venn diagrams for the MS, PA, and SA networks. C: GO enrichment graph for the hub genes, with terms including chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, and gene silencing. D: GO enrichment graph for the cell wall query, with terms including cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, and phenylpropanoid metabolic process.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among all three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106,591 edges were shared among all three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
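The Venn comparison of highly interacting genes across three networks amounts to set operations on per-network top-k gene lists. The sketch below assumes that "highly interacting" means highest node degree (number of edges), which the legend does not state explicitly; the toy edge lists and gene names are hypothetical.

```python
from collections import Counter

def top_genes_by_degree(edges, k):
    """Return the k genes with the most edges (ties broken by gene name)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return set(ranked[:k])

def venn3(pa, sa, ms):
    """Sizes-ready regions of a three-set Venn diagram."""
    return {
        "PA&SA&MS": pa & sa & ms,
        "PA&SA only": (pa & sa) - ms,
        "PA&MS only": (pa & ms) - sa,
        "SA&MS only": (sa & ms) - pa,
        "PA only": pa - sa - ms,
        "SA only": sa - pa - ms,
        "MS only": ms - pa - sa,
    }

# Hypothetical toy networks as lists of undirected edges.
pa_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
sa_edges = [("g1", "g2"), ("g1", "g5"), ("g2", "g5"), ("g1", "g3")]
ms_edges = [("g1", "g4"), ("g1", "g5"), ("g4", "g5"), ("g1", "g2")]

regions = venn3(
    top_genes_by_degree(pa_edges, 2),
    top_genes_by_degree(sa_edges, 2),
    top_genes_by_degree(ms_edges, 2),
)
```

The same region function applies to edges instead of genes if each edge is stored in an order-independent form such as a frozenset.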
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the query genes, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
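Panel A combines two operations: intersecting the edges of three networks and then retrieving the edges that touch a set of query genes. A minimal sketch of both steps follows, assuming undirected edges and treating any edge with at least one query gene as part of the subnetwork. The function names, gene names, and edge lists are hypothetical and do not reproduce the study's data or tooling.

```python
def as_edge_set(edges):
    """Store undirected edges as frozensets so (a, b) and (b, a) compare equal."""
    return {frozenset((a, b)) for a, b in edges if a != b}

def intersect_networks(*networks):
    """Edges present in every one of the given networks."""
    edge_sets = [as_edge_set(network) for network in networks]
    return set.intersection(*edge_sets)

def query_subnetwork(edge_set, seeds):
    """Edges touching at least one seed gene, plus the non-seed neighbours."""
    seeds = set(seeds)
    sub_edges = {edge for edge in edge_set if edge & seeds}
    nodes = set().union(*sub_edges) if sub_edges else set()
    return sub_edges, nodes - seeds

# Hypothetical toy networks and query genes.
pa = [("seed1", "geneA"), ("seed1", "geneB"), ("geneA", "geneC"), ("seed2", "geneB")]
sa = [("geneA", "seed1"), ("seed2", "geneB"), ("geneA", "geneC"), ("seed1", "geneD")]
ms = [("seed1", "geneA"), ("geneB", "seed2"), ("geneC", "geneA"), ("geneB", "geneD")]

shared_edges = intersect_networks(pa, sa, ms)
sub_edges, neighbours = query_subnetwork(shared_edges, ["seed1", "seed2"])
```

Requiring an edge to appear in all three networks is a strict filter: it trades coverage for confidence, so the retrieved neighbours are those supported by every inference method.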
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Downloaded from www.plantphysiol.org on August 22, 2020. Copyright © 2017 American Society of Plant Biologists. All rights reserved.
Figure legends

Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.

Figure 2. Effects of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
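
The box-plot outlier rule can be illustrated with a short Python sketch. It assumes the conventional fences of 1.5 times the interquartile range beyond the quartiles; the function name `iqr_outliers` and the AUROC values are hypothetical and are not taken from the study.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR above Q3 or below Q1."""
    # method="inclusive" interpolates linearly between observed data points;
    # plotting packages may compute quartiles slightly differently.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                      # interquartile range
    lower = q1 - k * iqr               # lower fence
    upper = q3 + k * iqr               # upper fence
    return [v for v in values if v < lower or v > upper]

# Hypothetical AUROC values, for illustration only
example_auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.95]
flagged = iqr_outliers(example_auroc)
```

Raising `k` widens the fences and flags fewer points; the value 1.5 is the usual box-plot convention.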

Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
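
A regression of average AUROC on the natural logarithm of sample size can be sketched in Python as below. This is a simplified ordinary least-squares stand-in for R's lm(), not the original analysis code; it returns the intercept, slope, and R-squared but does not compute p-values. The function name `fit_log_model` and the data values are hypothetical.

```python
import math

def fit_log_model(sample_sizes, mean_auroc):
    """Least-squares fit of mean_auroc = a + b * ln(sample_size).

    Returns (a, b, r2): intercept, slope, and coefficient of determination.
    """
    x = [math.log(n) for n in sample_sizes]      # natural-log transform
    y = list(mean_auroc)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                                # slope
    a = mean_y - b * mean_x                      # intercept
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return a, b, r2

# Hypothetical sample sizes and average AUROC values, for illustration only
sizes = [12, 24, 48, 100, 404, 1266]
auroc = [0.58, 0.60, 0.61, 0.63, 0.66, 0.68]
intercept, slope, r_squared = fit_log_model(sizes, auroc)
```

A positive slope in such a fit means that average AUROC rises with sample size, with diminishing gains as more samples are added.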

Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among the single network (white, "1266"), aggregated network (grey, "agg"), and protein network (dark grey, "pr") using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star sign is the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
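
For readers unfamiliar with the AUROC statistic reported in these evaluations, the sketch below computes it in Python from a list of prediction scores and binary labels using the rank-sum (Mann-Whitney) formulation. It is a generic illustration and not the evaluation code used in the study; the function name `auroc`, the scores, and the labels are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    scores: higher values mean a stronger prediction.
    labels: truthy for known positives, falsy for negatives.
    Tied scores receive their average rank.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with item i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0       # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, label in zip(ranks, labels) if label)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)

# Hypothetical scores for five genes; 1 marks genes annotated to a GO term
example_scores = [0.92, 0.85, 0.40, 0.33, 0.10]
example_labels = [1, 0, 1, 0, 0]
example_value = auroc(example_scores, example_labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to a ranking in which every positive outscores every negative.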

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.

Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by more than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A and B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithm transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation of the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
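
The CPM and RPKM normalizations and the log2 transformation can be written out in a few lines of Python. This sketch uses the standard definitions (counts scaled to one million reads per library; RPKM additionally divided by gene length in kilobases) and is not the pipeline used in the study. The pseudocount added before log2 is an assumption made here to avoid log2(0), and the counts and gene lengths are hypothetical. VST is omitted because it depends on a fitted dispersion model.

```python
import math

def cpm(counts):
    """Counts per million for one library: count / library size * 1e6."""
    library_size = sum(counts)
    return [c / library_size * 1e6 for c in counts]

def rpkm(counts, lengths_bp):
    """Reads per kilobase per million: CPM divided by gene length in kb."""
    return [value / (length / 1000.0)
            for value, length in zip(cpm(counts), lengths_bp)]

def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption, not from the source."""
    return [math.log2(v + pseudocount) for v in values]

# Hypothetical read counts and gene lengths for three genes in one library
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 500]

cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
log_cpm = log2_transform(cpm_values)
```

Because RPKM divides by gene length, two genes with equal CPM but different lengths receive different RPKM values, which is the relationship examined in panel C.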

Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). The average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed using the Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R2 value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to each gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.
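
A node degree distribution and a power-law fit of the form count = a × degree^b can be sketched in Python as follows. One common approach, assumed here, is ordinary least squares on log-transformed values; whether the original fit was computed this way is not stated in the legend. The function names `degree_distribution` and `fit_power_law` and the toy edge list are hypothetical.

```python
import math
from collections import Counter

def degree_distribution(edges):
    """Map node degree -> number of nodes with that degree (undirected edges)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

def fit_power_law(dist):
    """Fit count = a * degree**b by least squares in log-log space.

    Returns (a, b, r2), where r2 is computed on the log-transformed values.
    """
    degrees = sorted(dist)
    x = [math.log(k) for k in degrees]
    y = [math.log(dist[k]) for k in degrees]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                                # exponent
    log_a = mean_y - b * mean_x
    ss_res = sum((yi - (log_a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return math.exp(log_a), b, r2

# Toy co-expression edge list with hypothetical gene names
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g4", "g5")]
dist = degree_distribution(edges)
a, b, r2 = fit_power_law(dist)
```

A negative exponent with a high R2 indicates that a few genes have many connections while most have few, which is the pattern a power-law model describes.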

Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szcześniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532–545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43–50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602–4616
Li S Łabaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888–895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664–675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923–930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282–288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177–188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580–585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192–203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423–1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796–804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792–801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet a R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745–64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69–71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987–D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267–1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703–1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424–440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146–2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139–140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876–1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141–D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112–1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337–342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498–2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940–3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443–458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314–362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633–51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895–905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258–264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814–818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300–323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57–63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822–2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629–645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of Illumina RNA-Seq samples per year (solid line), 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) method. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM method. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM method. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
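The outlier rule stated in this legend (values beyond 1.5 times the interquartile range from the quartiles) can be sketched as follows. This is a minimal illustration and not the authors' plotting code: the function name `tukey_outliers` and the AUROC values are hypothetical, and the quartiles come from Python's `statistics.quantiles`, whose estimates can differ slightly from the hinges used by plotting software.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # Quartiles by linear interpolation over the observed data range.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values for one inference method (not from the paper).
auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.92]
print(tukey_outliers(auroc))
```

With these made-up values only the single high AUROC falls outside the fences; changing `k` widens or narrows them.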
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values shown on a color scale from blue to red. Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and P values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
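The logarithmic model described in this legend, average AUROC regressed on ln(sample size), can be sketched as an ordinary least-squares fit. The legend states that lm() in R was used; the Python below is only an illustrative equivalent under that assumption. The function name `fit_log_model` and the data points are hypothetical, and the P value that lm() reports is not computed here.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = a + b * ln(n); returns (a, b, r_squared)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope per unit of ln(sample size)
    a = mean_y - b * mean_x       # intercept
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Hypothetical (sample size, average AUROC) pairs, not values from the paper.
sizes = [12, 50, 200, 800, 1266]
scores = [0.58, 0.62, 0.66, 0.69, 0.71]
a, b, r2 = fit_log_model(sizes, scores)
print(f"AUROC ~ {a:.3f} + {b:.3f} * ln(n), R^2 = {r2:.3f}")
```

A positive slope `b` in such a fit would correspond to performance that keeps improving with sample size, but with diminishing returns per added library.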
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Fig 7 panels: A and B, Venn diagrams of the MS, PA, and SA networks. C, GO term enrichment network labeled Hub, with terms including chromatin assembly, nucleosome organization, DNA replication, DNA repair, gene silencing, and RNA processing. D, GO term enrichment network labeled Cell Wall, with terms including cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, phenylpropanoid biosynthetic process, and root morphogenesis.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 cell wall pathway query genes, cyan nodes are genes with reported function in cell wall-related pathways in plants, and dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions.
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics5 87
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within genecoexpression networks BMC Bioinformatics 6 227
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering PlantsStudy from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize HistoneDeacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstructionPLoS One 9 e106319 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Figure 3. Similarity among the ten inference methods in network performance based on GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between -3 (blue) and 3 (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

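The scaling and distance steps named in the Figure 3 legend can be illustrated with a minimal sketch. It assumes that scaling is applied per row (one row per GO term or gene, one column per inference method) using the sample standard deviation, and that methods are compared by the Euclidean distance between their columns. The AUROC values and the helper names (`zscore_rows`, `euclidean`, `column`) are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev

def zscore_rows(matrix):
    """Scale each row to mean 0 and sample standard deviation 1."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)
        # A constant row has no spread; map it to zeros instead of dividing by 0.
        scaled.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return scaled

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def column(matrix, j):
    """Extract column j of a row-major matrix."""
    return [row[j] for row in matrix]

# Hypothetical AUROC values: rows are GO terms, columns are methods.
methods = ["PCC", "SCC", "CLR"]
auroc = [
    [0.70, 0.72, 0.60],
    [0.55, 0.56, 0.65],
    [0.80, 0.78, 0.62],
]

scaled = zscore_rows(auroc)
d_pcc_scc = euclidean(column(scaled, 0), column(scaled, 1))
d_pcc_clr = euclidean(column(scaled, 0), column(scaled, 2))
```

In this toy matrix the two correlation columns end up closer to each other than either is to the CLR column, which is the kind of grouping a distance-based clustering would pick up.
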
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks, plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks, plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

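The logarithmic regression described in the Figure 4 legend amounts to an ordinary least-squares fit of average AUROC against the natural logarithm of sample size. The sketch below is an illustration in plain Python rather than the R lm() call used by the authors. The function name `fit_log_model` and the data points are assumptions, with the AUROC values generated synthetically from a known line. It returns the intercept, slope, and R-squared, and omits the p-value, which would additionally require a t-distribution.

```python
import math

def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size)."""
    xs = [math.log(n) for n in sample_sizes]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(aurocs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - y_bar) ** 2 for y in aurocs)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared

# Synthetic example: AUROC values placed exactly on 0.5 + 0.02 * ln(n).
sizes = [12, 24, 48, 96, 404, 1266]
aurocs = [0.5 + 0.02 * math.log(n) for n in sizes]

intercept, slope, r_squared = fit_log_model(sizes, aurocs)
```

Because the synthetic points lie on the line by construction, the fit recovers the generating coefficients and R-squared is essentially 1. Real AUROC values would scatter around the fitted curve.
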
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Seventeen individual networks were labeled S12_1 to S404; S1266 included all samples from the 17 experiments. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated “_1”, “_2”, and “_3”. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Fig 6. GCN performance comparison among the single network (white, “1266”), aggregated network (grey, “agg”), and protein network (dark grey, “pr”) using PCC, SCC, MRNET, and CLR. A, GO evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). AUROC values were plotted against network types. B, PPPTY evaluation of networks. Inference methods are indicated by a single letter (p, PCC; s, SCC; m, MRNET; c, CLR). Network types were plotted against AUROC values. Bold horizontal lines indicate the median; the star indicates the mean value of each box. Outliers are plotted as grey dots.

Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.

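The overlaps summarized in the Figure 7 Venn diagrams reduce to set intersections. The sketch below assumes that "highest interacting genes" means the genes with the most edges (highest degree), an interpretation the legend does not state explicitly. The three toy edge lists, the gene names, and the helper names are hypothetical.

```python
from collections import Counter

def top_k_by_degree(edges, k):
    """Return the k genes with the most edges (ties broken by gene name)."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:k])

def undirected(edges):
    """Represent each edge as a frozenset so (a, b) and (b, a) compare equal."""
    return {frozenset(edge) for edge in edges}

# Hypothetical toy networks standing in for PA, SA, and MS.
pa = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
sa = [("g1", "g2"), ("g2", "g3"), ("g2", "g4"), ("g1", "g4")]
ms = [("g1", "g5"), ("g1", "g3"), ("g3", "g5"), ("g3", "g2")]

# Panel A analogue: genes shared among the top-k lists of all three networks.
shared_genes = top_k_by_degree(pa, 2) & top_k_by_degree(sa, 2) & top_k_by_degree(ms, 2)

# Panel B analogue: edges present in all three networks.
shared_edges = undirected(pa) & undirected(sa) & undirected(ms)
```

With real networks, `k` would be 1000 for the gene comparison, and the edge lists would first be truncated to the top-ranked edges before intersecting.
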
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). Cyan nodes are genes with reported function in cell wall-related pathways in plants. Dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways. Grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported into the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by more than 10 libraries were grouped together as Others.

Supplemental Figure 2 Distribution of gene expression values The frequency of each expression level in the 734
dataset (Density) was plotted against gene expression (Expr) which was calculated after normalization by 735
Variance Stabilizing Transformation (VST) Counts Per Million (CPM) or Reads Per Kilobase per Million 736
mapped reads (RPKM) A-B distribution of expression values for samples normalized with CPM (black line 737
CPM graph) and RPKM (black line RPKM graph) before (A) and after (B) logarithm normalization (log2) VST 738
values are log2 transformed by default The normal distribution of expression (dot lines) was calculated using 739
dnorm() function in R which takes the mean value and standard deviation from log2 transformed expressions 740
C Normalized gene expression values for 15116 genes were averaged libraries and plotted as a function of 741
gene length in base pairs (bp) 742
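The normalizations and the reference normal curve described in this legend can be sketched in a few lines. The Python below is a minimal illustration, not the authors' R code: the pseudocount of 1 before the log2 transform, the toy counts and gene lengths, and all function names are assumptions made for the example. The function normal_density computes the same quantity that dnorm() returns in R.

```python
import math
from statistics import mean, stdev

def cpm(counts):
    """Counts Per Million for one library: count / library size * 1e6."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]

def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: CPM divided by gene length in kb."""
    return [v / (length / 1000.0) for v, length in zip(cpm(counts), lengths_bp)]

def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption to avoid log2(0)."""
    return [math.log2(v + pseudocount) for v in values]

def normal_density(x, mu, sigma):
    """Normal probability density at x for the given mean and standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Toy library: read counts and gene lengths (bp) for four hypothetical genes.
counts = [10, 200, 3000, 790]
lengths = [500, 1000, 2000, 4000]

log_cpm = log2_transform(cpm(counts))
mu, sigma = mean(log_cpm), stdev(log_cpm)  # sample SD, as R's sd() computes
reference_curve = [normal_density(v, mu, sigma) for v in log_cpm]
```

Because RPKM divides by gene length and CPM does not, plotting either value against gene length, as in panel C, is one way to see how much length dependence each normalization leaves in the data.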
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST-, CPM-, and RPKM-normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of the full set of 1720 PPPTY genes (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
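The outlier rule used in these box plots (values more than 1.5 times the interquartile range beyond the quartiles) can be written out as a short sketch. The Python below is illustrative only: the AUROC values are made up, the function name is hypothetical, and the "inclusive" quantile method is an assumption, so the quartiles may differ slightly from the hinges that R's boxplot() uses.

```python
from statistics import quantiles

def boxplot_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical per-term AUROC values for one network.
auroc = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
outliers = boxplot_outliers(auroc)  # only 0.95 falls outside the fences
```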
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R² value indicates the goodness of fit to the power-law model.

Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R² indicated beside it.
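A power-law fit with an R² value, as reported in these legends, is commonly obtained by linear least squares on log-transformed data. The Python below sketches that approach under stated assumptions: it is not necessarily the procedure the network analysis software applied here, the function name and toy degree counts are hypothetical, and the R² it returns is computed on the log scale.

```python
import math

def fit_power_law(degrees, node_counts):
    """Fit y = a * x**b by least squares on (ln x, ln y); return (a, b, r_squared)."""
    xs = [math.log(d) for d in degrees]
    ys = [math.log(c) for c in node_counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # exponent of the power law
    log_a = mean_y - b * mean_x        # intercept on the log scale
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot

# Toy degree distribution that follows y = 1000 * x**-1.5 exactly.
degrees = [1, 2, 4, 8, 16]
node_counts = [1000.0 * d ** -1.5 for d in degrees]
a, b, r_squared = fit_power_law(degrees, node_counts)
```

A negative exponent b with R² close to 1 is the pattern usually read as scale-free-like topology, with many low-degree genes and a few highly connected ones.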
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R package version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc114132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, de Leon N, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single-network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values colored from low (blue) to high (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
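The scaling and distance steps named in this caption can be sketched in Python as follows. This is a minimal sketch under stated assumptions: each GO term's AUROC values are scaled across methods with the sample standard deviation, methods are then compared by Euclidean distance, and the AUROC numbers in `toy_auroc_by_method` are invented for illustration. The helper names `zscore` and `euclidean` are hypothetical.

```python
import math
from statistics import mean, stdev


def zscore(values):
    """Scale values to mean 0 and sample standard deviation 1."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("cannot scale a constant vector")
    return [(v - mu) / sd for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical AUROC values: one list per method, one entry per GO term.
toy_auroc_by_method = {
    "PCC": [0.70, 0.60, 0.80],
    "SCC": [0.72, 0.58, 0.79],
    "CLR": [0.55, 0.75, 0.65],
}
methods = list(toy_auroc_by_method)
n_terms = len(toy_auroc_by_method[methods[0]])

# Scale each term (row) across methods, then regroup the scaled values by method.
scaled_rows = [
    zscore([toy_auroc_by_method[m][t] for m in methods]) for t in range(n_terms)
]
scaled_by_method = {
    m: [scaled_rows[t][i] for t in range(n_terms)] for i, m in enumerate(methods)
}

dist_pcc_scc = euclidean(scaled_by_method["PCC"], scaled_by_method["SCC"])
dist_pcc_clr = euclidean(scaled_by_method["PCC"], scaled_by_method["CLR"])
```

With these toy values the two correlation methods end up much closer to each other than either is to CLR, which is the kind of grouping a distance-based clustering would then pick up.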
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
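The logarithmic regression described in this caption was fitted with lm() in R. The Python sketch below shows the same kind of fit, average AUROC = intercept + slope × ln(sample size), using ordinary least squares and reporting R-squared. The function name `fit_log_model` is hypothetical, the AUROC values are synthetic (generated from a known model so the fit can be checked), and the p-value that lm() also reports is not computed here.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Sample sizes: 12 and 1266 are the extremes named in the paper; the
# intermediate sizes and all AUROC values below are synthetic.
toy_sizes = [12, 50, 200, 1266]
toy_aurocs = [0.5 + 0.03 * math.log(n) for n in toy_sizes]
intercept, slope, r_squared = fit_log_model(toy_sizes, toy_aurocs)
```

Because the synthetic values lie exactly on a logarithmic curve, the fit recovers the generating intercept (0.5) and slope (0.03) with R-squared equal to 1 up to rounding error. Real AUROC values would scatter around the fitted line.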
[Figure 4 panels: A, GO evaluation (PCC, SCC, MRNET, CLR); B, PPPTY evaluation (PCC, SCC, MRNET, CLR).]
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks constructed with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
[Figure 6 panels: A, Protein GO; B, Protein PPPTY; y-axis, AUC.]
[Figure 7 panels. A and B: Venn diagrams comparing the MS, PA, and SA networks. C (Hub): GO term enrichment graph with terms such as chromatin assembly, nucleosome organization, DNA replication, DNA repair, and gene silencing. D (Cell Wall): GO term enrichment graph with terms such as cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, and phenylpropanoid metabolic process.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
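The overlap counted in panel A can be illustrated with a short Python sketch: rank the genes in each network, keep the top k, and intersect the resulting sets. The sketch assumes that "highly interacting" means having the largest number of edges (node degree), which is an assumption rather than the paper's stated definition. The helper names and the tiny edge lists are hypothetical.

```python
from collections import Counter


def top_k_genes(edges, k):
    """Return the k genes with the highest degree in an undirected edge list.

    Ties are broken alphabetically so the result is deterministic.
    """
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return set(ranked[:k])


def shared_genes(*gene_sets):
    """Genes present in every one of the given sets."""
    return set.intersection(*gene_sets)


# Hypothetical toy networks standing in for PA, SA, and MS.
pa_edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
sa_edges = [("g1", "g2"), ("g2", "g3"), ("g2", "g4"), ("g1", "g5")]
ms_edges = [("g1", "g5"), ("g1", "g3"), ("g5", "g3"), ("g5", "g2")]

top_pa = top_k_genes(pa_edges, 2)
top_sa = top_k_genes(sa_edges, 2)
top_ms = top_k_genes(ms_edges, 2)
shared = shared_genes(top_pa, top_sa, top_ms)
```

Here only `g1` is among the top two genes of all three toy networks. The same intersection applied to sets of edges instead of genes gives the kind of count shown in panel B.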
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported functions in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science (80- ) 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

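The AUROC values summarized in Figure 6 measure how well a network ranks the genes of a known gene set (for example, one GO term) above all other genes. As an illustration only, and not the authors' evaluation code, the sketch below computes an AUROC from per-gene scores using the rank-sum (Mann-Whitney) identity. The function name `auroc`, the toy scores, and the labels are hypothetical.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    Tied scores receive the average of the ranks they span.
    """
    n_pos = sum(1 for flag in labels if flag)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative gene")

    # Sort gene indices by score, then assign 1-based average ranks to ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1

    rank_sum_pos = sum(rank for rank, flag in zip(ranks, labels) if flag)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical example: six genes, three of them annotated with the term of interest.
neighbor_scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
in_gene_set = [True, True, True, False, False, False]
print(auroc(neighbor_scores, in_gene_set))
```

In this toy case, eight of the nine positive-negative pairs are ordered correctly, so the value is 8/9. A value of 0.5 corresponds to random ranking.
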
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.

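The overlap counts reported in Figure 7 amount to intersecting the top-ranked genes or edges of each network. The following is a minimal sketch of that set operation for edges, assuming each network is available as a mapping from gene pairs to edge weights. The helper `top_edges`, the gene names, and the weights are hypothetical and serve only as an illustration.

```python
from typing import Dict, FrozenSet, Set, Tuple

Edge = FrozenSet[str]


def top_edges(weights: Dict[Tuple[str, str], float], k: int) -> Set[Edge]:
    """Return the k highest-weighted edges as undirected (order-free) pairs."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:k]}


# Hypothetical toy networks; gene names and weights are made up.
pa = {("g1", "g2"): 0.9, ("g2", "g3"): 0.8, ("g1", "g4"): 0.3}
sa = {("g2", "g1"): 0.7, ("g3", "g4"): 0.6, ("g2", "g3"): 0.5}
ms = {("g1", "g2"): 0.4, ("g2", "g3"): 0.2, ("g4", "g5"): 0.9}

k = 2
shared = top_edges(pa, k) & top_edges(sa, k) & top_edges(ms, k)
print(sorted(sorted(edge) for edge in shared))
```

Storing each edge as a `frozenset` makes the comparison independent of gene order, so ("g1", "g2") and ("g2", "g1") count as the same edge.
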
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.

Supplemental Figure 1. Pipeline and datasets used for analysis. A, Workflow used in this analysis. Independent steps are labeled in square boxes, with alternative algorithms for each step in the rounded boxes. Software and packages for each step are in italics between the boxes. Raw data files were acquired from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database, converted to a common format (fastq files), and aligned to the maize AGPv3 genome (Alignment). Gene-level reads were counted (Read Count) to generate an expression matrix, which was imported to the R environment for the normalization, inference, and evaluation steps. All networks were visualized in Cytoscape. B, Relative representation of different maize tissues in the acquired datasets. Tissues are listed by name with the percentage of the 1266 libraries originating from each tissue. SAM, Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC, Multi-parent Advanced Generation InterCrosses. Genotypes represented by fewer than 10 libraries were grouped together as Others.

Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithmic transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation of the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).

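For readers unfamiliar with the two count-based normalizations named in Supplemental Figure 2, the sketch below applies the standard CPM and RPKM definitions to a single library and then takes log2. It is an illustration under stated assumptions, not the pipeline used in the study. The library size is taken as the sum of the counts shown, the pseudocount of 1 is an assumption, and the counts and gene lengths are made up.

```python
import math
from typing import List


def cpm(counts: List[float]) -> List[float]:
    """Counts per million for one library: count / library size * 1e6."""
    library_size = sum(counts)
    return [count / library_size * 1e6 for count in counts]


def rpkm(counts: List[float], lengths_bp: List[int]) -> List[float]:
    """Reads per kilobase per million mapped reads for one library."""
    library_size = sum(counts)
    return [
        count * 1e9 / (library_size * length)
        for count, length in zip(counts, lengths_bp)
    ]


def log2_transform(values: List[float], pseudocount: float = 1.0) -> List[float]:
    """log2(value + pseudocount); the pseudocount of 1 is an assumption."""
    return [math.log2(value + pseudocount) for value in values]


# Hypothetical library with three genes.
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 500]
print(cpm(counts))
print(rpkm(counts, lengths_bp))
print(log2_transform(cpm(counts)))
```

CPM corrects only for sequencing depth, whereas RPKM additionally divides by gene length, which is why panel C examines expression as a function of gene length.
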
Supplemental Figure 3. CPM-normalized, log2-transformed maize gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient and ranging from 0.6 to 1.0. Red color indicates higher correlation.

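The heat map in Supplemental Figure 3B is built from Pearson correlation coefficients between the expression profile of pollen and that of each other tissue. A minimal sketch of the coefficient itself is given below. The function name `pearson` and the expression vectors are hypothetical, and the values are not taken from the dataset.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical log2 CPM values for five genes in two tissues.
pollen = [0.5, 3.2, 7.1, 1.0, 4.4]
leaf = [1.1, 2.9, 6.0, 2.2, 5.1]
print(pearson(pollen, leaf))
```
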
Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding Pearson correlation coefficients shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). The average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.

Supplemental Figure 6. Evaluation of network performance based on sample size and inference. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are labeled with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.

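The outlier rule stated in Supplemental Figure 6 (beyond 1.5 times the interquartile range from the quartiles) can be sketched as follows. This is an illustration, not the plotting code used for the figure. The quartile convention (`method="inclusive"`) is an assumption and may differ slightly from the one used by a given box plot routine, and the function name and values are hypothetical.

```python
import statistics
from typing import List, Sequence


def tukey_outliers(values: Sequence[float], whisker: float = 1.5) -> List[float]:
    """Return values lying more than `whisker` * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical AUROC-like values with one extreme point.
auroc_values = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.70, 0.95]
print(tukey_outliers(auroc_values))
```
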
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.

Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R2 value indicates the goodness of fit to the power-law model.

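As a rough illustration of the quantity plotted in Supplemental Figure 8, the sketch below computes, for an undirected network, the neighborhood connectivity of each node (the mean degree of its neighbors) and then averages it over all nodes with the same number of neighbors. This follows the usual definition of the statistic and is an assumption about how the figure was produced. The function name and the toy edge list are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def neighborhood_connectivity_by_degree(
    edges: Iterable[Tuple[str, str]]
) -> Dict[int, float]:
    """Average neighborhood connectivity for each node degree k."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    per_degree: Dict[int, List[float]] = defaultdict(list)
    for node, adjacent in neighbors.items():
        # Neighborhood connectivity: mean degree of the node's neighbors.
        connectivity = sum(len(neighbors[n]) for n in adjacent) / len(adjacent)
        per_degree[len(adjacent)].append(connectivity)

    return {k: sum(v) / len(v) for k, v in sorted(per_degree.items())}


# Hypothetical star-shaped network: one hub connected to three genes.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(neighborhood_connectivity_by_degree(toy_edges))
```

In the star example, the three leaf genes (one neighbor each) have a neighborhood connectivity of 3, and the hub (three neighbors) has a neighborhood connectivity of 1, so the curve decreases with the number of neighbors.
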
Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.

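A power-law fit of a degree distribution, as shown in Supplemental Figure 9, is commonly obtained by linear regression on log-transformed values. The sketch below fits y = a * x^b this way and reports R2 on the log scale. Whether the figure was fitted in exactly this manner is an assumption, and the function name and data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(
    degrees: Sequence[float], node_counts: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit y = a * x**b by least squares on log-log data.

    Returns (a, b, r_squared), with r_squared computed on the log scale.
    """
    log_x = [math.log(d) for d in degrees]
    log_y = [math.log(c) for c in node_counts]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(log_x, log_y))
    ss_tot = sum((y - mean_y) ** 2 for y in log_y)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot


# Hypothetical degree distribution that follows y = 1000 * x**-2 exactly.
degrees = [1, 2, 4, 8]
node_counts = [1000.0, 250.0, 62.5, 15.625]
a, b, r_squared = fit_power_law(degrees, node_counts)
print(a, b, r_squared)
```

A negative exponent b with R2 close to 1 indicates that a few genes have many connections while most have few, the pattern expected of a scale-free network.
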
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.

Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science (80- ) 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff N V (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: a matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: a R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
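The AUROC statistic used throughout these panels can be illustrated with a short sketch. This is a generic rank-based (Mann-Whitney) computation written for illustration only; it is not the evaluation pipeline used in the study, and the function name `auroc` and the example scores and labels are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve from prediction scores and binary labels.

    Computed as the fraction of (positive, negative) pairs in which the
    positive item has the higher score; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: co-expression scores for six genes and whether each
# gene is annotated to the GO term being evaluated (1) or not (0).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
example_auroc = auroc(scores, labels)  # 8 of 9 pairs are ranked correctly
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why higher values in the panels indicate better network performance.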
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values on a color scale from blue to red. Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
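The scaling and clustering step described in this caption can be sketched as follows. The sketch assumes that each GO term's AUROC values are z-scored across methods using the sample standard deviation, and that methods are then compared by Euclidean distance; both choices, the helper names, and the numbers are illustrative assumptions, not details confirmed by the source.

```python
import math
import statistics


def zscore(values):
    """Scale values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD, as R's scale() uses
    return [(v - mean) / sd for v in values]


# Hypothetical AUROC values: one row per GO term, one column per method.
methods = ["PCC", "SCC", "CLR"]
auroc_rows = [
    [0.60, 0.62, 0.70],
    [0.55, 0.56, 0.66],
]

# Scale each GO term (row) across methods.
scaled_rows = [zscore(row) for row in auroc_rows]

# Build one profile per method (column) and compare profiles pairwise.
profiles = {m: [row[i] for row in scaled_rows] for i, m in enumerate(methods)}
dist_pcc_scc = math.dist(profiles["PCC"], profiles["SCC"])
dist_pcc_clr = math.dist(profiles["PCC"], profiles["CLR"])
```

In this toy example PCC and SCC end up closer to each other than PCC and CLR, so a distance-based clustering would group the two correlation methods together.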
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
[Figure 4 panel titles. A: GO PCC, GO SCC, GO MRNET, GO CLR. B: PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR.]
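The logarithmic model in Figure 4 has the form average AUROC = a + b · ln(sample size). The caption states that the fit was done with lm() in R; the sketch below is a plain-Python ordinary least squares equivalent for the coefficients and R-squared only (it does not compute p-values). The function name and the data are hypothetical.

```python
import math


def fit_log_model(sample_sizes, auroc_values):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(auroc_values)
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Hypothetical data generated from an exact logarithmic relationship,
# so the fit should recover intercept 0.5 and slope 0.03.
sizes = [12, 50, 200, 1266]
values = [0.5 + 0.03 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
```

A positive slope with a high R-squared would indicate that performance keeps improving with sample size but with diminishing returns, which is the pattern a logarithmic model is meant to capture.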
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
[Figure 6 panels. A: GO. B: PPPTY. Vertical axis: AUC. The protein network is labeled "Protein".]
[Figure 7 graphic. A and B: Venn diagrams for the MS, PA, and SA networks. C (labeled "Hub"): enriched GO terms include chromatin assembly, chromatin modification, chromatin organization, nucleosome assembly, DNA packaging, DNA replication, DNA repair, gene silencing, epigenetic regulation of gene expression, RNA processing, and translation. D (labeled "Cell Wall"): enriched GO terms include cell wall biogenesis, cell wall organization, plant-type cell wall biogenesis, cellulose biosynthetic process, glucan biosynthetic process, hemicellulose metabolic process, phenylpropanoid biosynthetic process, and root morphogenesis.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
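The Venn-diagram counts in panels A and B are set overlaps among three ranked lists. A minimal sketch of that bookkeeping is shown below; the function name, region labels, and gene identifiers are hypothetical, and the toy sets stand in for the top 1000 genes (or top edges) of each network.

```python
def venn3_counts(pa, sa, ms):
    """Count the members of each of the seven regions of a 3-set Venn diagram."""
    pa, sa, ms = set(pa), set(sa), set(ms)
    return {
        "PA only": len(pa - sa - ms),
        "SA only": len(sa - pa - ms),
        "MS only": len(ms - pa - sa),
        "PA and SA only": len((pa & sa) - ms),
        "PA and MS only": len((pa & ms) - sa),
        "SA and MS only": len((sa & ms) - pa),
        "all three": len(pa & sa & ms),
    }


# Hypothetical gene identifiers, not taken from the study.
pa_top = {"g1", "g2", "g3", "g4"}
sa_top = {"g2", "g3", "g4", "g5"}
ms_top = {"g3", "g4", "g6"}
counts = venn3_counts(pa_top, sa_top, ms_top)
```

The "all three" entry corresponds to the shared core reported in the caption (148 genes in panel A, 106591 edges in panel B); for edges, each item would be an unordered gene pair, for example a `frozenset` of two identifiers.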
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed Citations
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R package version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013; 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, de Leon N, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
of the 1266 libraries originating from each tissue. SAM = Shoot Apical Meristem. Samples are grouped by tissue and may be represented by one or more developmental stages of that tissue. Tissues represented by fewer than 10 libraries were grouped together as Others. C, Relative representation of different maize genotypes in our datasets. Genotypes are listed by name with the percentage of the 1266 libraries originating from each genotype. MAGIC = Multi-parent Advanced Generation InterCrosses. Genotypes represented by more than 10 libraries were grouped together as Others.
Supplemental Figure 2. Distribution of gene expression values. The frequency of each expression level in the dataset (Density) was plotted against gene expression (Expr), which was calculated after normalization by Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million mapped reads (RPKM). A-B, Distribution of expression values for samples normalized with CPM (black line, CPM graph) and RPKM (black line, RPKM graph) before (A) and after (B) logarithmic transformation (log2). VST values are log2 transformed by default. The normal distribution of expression (dotted lines) was calculated using the dnorm() function in R, which takes the mean and standard deviation from the log2-transformed expression values. C, Normalized gene expression values for 15116 genes were averaged across libraries and plotted as a function of gene length in base pairs (bp).
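The CPM and RPKM normalizations named in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which the legend indicates was run in R); the function names, the toy counts and gene lengths, and the pseudocount of 1 used before the log2 transform are all assumptions made for illustration.

```python
import math

def cpm(counts):
    """Counts Per Million: scale each gene's read count by the library size."""
    library_size = sum(counts)
    return [c * 1e6 / library_size for c in counts]

def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million mapped reads: CPM divided by gene length in kb."""
    return [v / (length / 1000.0) for v, length in zip(cpm(counts), lengths_bp)]

def log2_transform(values, pseudocount=1.0):
    """log2 transform; the pseudocount is an assumption that avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]

# Toy library: three genes with made-up read counts and lengths in bp.
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 500]

cpm_values = cpm(counts)
rpkm_values = rpkm(counts, lengths_bp)
log_cpm = log2_transform(cpm_values)
```

Because RPKM divides by gene length, the shortest gene in the toy data ends up with the largest RPKM value even though its CPM is only twice that of the second gene; panel C of the figure examines this kind of length dependence.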
Supplemental Figure 3. Maize CPM-normalized, log2-transformed gene expression from all tissues and developmental stages (Stelpflug et al., 2015). A, Clustering dendrogram of samples based on Euclidean distance (Height). DAS, days after sowing; DAP, days after pollination; V1-V18, vegetative developmental stages. B, Heat map of the gene expression correlation between pollen tissue and 78 other tissues, calculated by Pearson correlation coefficient, ranging from 0.6 to 1.0. Red color indicates higher correlation.
Supplemental Figure 4. Pairwise comparison among results of inference methods. A, GO evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by GO datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. B, PPPTY evaluation comparisons for VST, CPM, and RPKM normalized data. The AUROC value density for each method was plotted in the diagonal line of blocks between AUROC values and PCC values. AUROC values evaluated by PPPTY datasets were plotted pairwise in the triangle below the diagonal, with the corresponding coefficient values, as calculated by Pearson correlation, shown in the triangle above the diagonal. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; Bi, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
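Three of the correlation measures abbreviated in this legend (PCC, SCC, and CSC) can be written out in a few lines. The sketch below is a plain-Python illustration on two made-up expression vectors; the function names are hypothetical, and it is not the R implementation used for the networks in the paper.

```python
import math

def pearson(x, y):
    """Pearson Correlation Coefficient (PCC) of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation coefficient (SCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def cosine(x, y):
    """Cosine Similarity Coefficient (CSC): like PCC but without mean-centering."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# Made-up expression of two genes across five samples (gene_b = gene_a squared).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]

pcc = pearson(gene_a, gene_b)
scc = spearman(gene_a, gene_b)
csc = cosine(gene_a, gene_b)
```

In this toy case the relationship is monotonic but not linear, so the rank-based SCC reaches 1 while PCC and CSC stay slightly below it.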
Supplemental Figure 5. Characteristics of all 1720 genes in the PPPTY gene set (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.
Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from 17 individual networks of each category. Mean values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
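The outlier rule stated in this legend (values beyond 1.5 times the interquartile range from the quartiles) can be sketched as follows. The AUROC values are made up, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which is an assumption: plotting software may compute quartiles slightly differently.

```python
import statistics

def boxplot_outliers(values, whisker=1.5):
    """Return values lying more than whisker * IQR below Q1 or above Q3."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]

# Made-up AUROC values for one hypothetical network.
auroc_values = [0.52, 0.61, 0.63, 0.64, 0.66, 0.67, 0.69, 0.70, 0.95]
outliers = boxplot_outliers(auroc_values)
```

With these numbers Q1 is 0.63 and Q3 is 0.69, so the fences sit near 0.54 and 0.78, and the two extreme values are flagged.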
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, Area Under the ROC curve (AUROC) values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
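As a reminder of what the AUROC statistic measures, the sketch below computes it as the probability that a randomly chosen positive item scores higher than a randomly chosen negative one, with ties counted as one half. It illustrates the statistic only and does not reproduce the GO or PPPTY evaluation procedure; the scores and labels are invented.

```python
def auroc(scores, labels):
    """Area Under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Invented scores for five genes; label 1 marks genes annotated to a term.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
example_auroc = auroc(scores, labels)
```

Here five of the six positive-negative pairs are ordered correctly, giving an AUROC of 5/6; a value of 0.5 would correspond to random ranking.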
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distribution. The R2 value indicates the goodness of fit to the power-law model.
Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the function and R2 indicated beside it.
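One common way to obtain a power-law fit and an R2 value for a degree distribution is ordinary least squares on log-log axes. The sketch below takes that approach as an assumption, since the legend does not state how the curve was fitted; the function name and the synthetic degree counts are illustrative only.

```python
import math

def fit_power_law(degrees, counts):
    """Fit counts = a * degree**b by least squares on log10-log10 axes.

    Returns (a, b, r2), where r2 is the coefficient of determination
    of the straight-line fit in log-log space.
    """
    xs = [math.log10(d) for d in degrees]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = my - b * mx
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 10 ** log_a, b, 1.0 - ss_res / ss_tot

# Synthetic degree distribution that follows an exact power law (exponent -1.5).
degrees = [1, 2, 4, 8, 16]
counts = [1000.0 * d ** -1.5 for d in degrees]
a, b, r2 = fit_power_law(degrees, counts)
```

On this noise-free input the fit recovers the generating parameters and R2 is essentially 1; real degree distributions scatter around the line and give lower values.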
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.
Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5-e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabasi A-L, Oltvai ZNZN, Barabási A-L (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545-3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572–15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143–175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671–683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671–683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670–85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64–70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575–1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924–13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054–0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200–20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244–56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277–286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860–1866
Giorgi FM Del Fabbro C Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1–8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866–12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29–46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25–48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592–1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357–360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520–2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812–1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532–545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43–50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602–4616
Li S Łabaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888–895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664–675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923–930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282–288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177–188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580–585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192–203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423–1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796–804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp 10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792–801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745–64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69–71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987–D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267–1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc.114.132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703–1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424–440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146–2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139–140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876–1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141–D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112–1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337–342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498–2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940–3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443–458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314–362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447–D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633–51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895–905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258–264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814–818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300–323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57–63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822–2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629–645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195–214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp.15.01821
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of Illumina RNA-Seq samples (solid line) per year from 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
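The AUROC values in Figure 2 score how well a ranked list of network predictions separates known positives from negatives. The following is a minimal sketch of that calculation using the rank-sum (Mann-Whitney U) identity. It is a generic illustration, not the code of the EGAD or ROCR packages cited in the references, and the function name `auroc` and the toy inputs are assumptions made for this example.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the rank-sum identity (hypothetical helper, not from the paper).

    scores: prediction score per item, higher meaning more likely positive.
    labels: 1 for a known positive (e.g. annotated gene), 0 otherwise.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank shared by ties
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # U statistic of the positives, normalized by the number of pos/neg pairs.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data: two annotated genes ranked above two unannotated genes.
    print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

An AUROC of 0.5 corresponds to a ranking no better than chance, which is the baseline against which the box plots in the figure can be read.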
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values colored from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. Panels show GO PCC, GO SCC, GO MRNET and GO CLR (A) and PPPTY PCC, PPPTY SCC, PPPTY MRNET and PPPTY CLR (B). PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
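The logarithmic model in Figure 4 regresses average AUROC on the natural logarithm of sample size. The caption reports that the fit was done with lm() in R; the sketch below is a plain-Python stand-in for the same ordinary least squares fit, given only as an illustration. The function name `fit_log_model` and the input values are hypothetical and are not data from the paper, and the p-value that lm() also reports is not computed here.

```python
import math
from typing import Sequence, Tuple


def fit_log_model(
    sample_sizes: Sequence[int], mean_aurocs: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit mean_auroc = intercept + slope * ln(sample_size) by least squares.

    Returns (intercept, slope, r_squared). Hypothetical helper for illustration.
    """
    if len(sample_sizes) != len(mean_aurocs) or len(sample_sizes) < 2:
        raise ValueError("need two equal-length inputs with at least 2 points")

    x = [math.log(n) for n in sample_sizes]  # natural log, as in the caption
    y = list(mean_aurocs)
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)

    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("sample sizes must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return intercept, slope, r_squared


if __name__ == "__main__":
    # Made-up example values; only the sizes 12 and 1266 echo the captions.
    sizes = [12, 100, 500, 1266]
    aurocs = [0.55, 0.60, 0.63, 0.65]
    print(fit_log_model(sizes, aurocs))
```

A positive slope in such a fit means performance keeps rising with sample size, but with diminishing returns, since each doubling of the sample count adds the same fixed increment.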
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries plotted against sample size. Networks built from the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). In both panels, bold horizontal lines indicate medians, asterisks indicate means and grey dots indicate outliers.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS (panel title: Hub). D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes (panel title: Cell Wall).
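The Venn diagrams in Figure 7 compare the most highly connected genes across three networks. A minimal sketch of that comparison follows: rank genes by degree, keep the top k per network, then count how many genes fall in each exclusive Venn region. The function names `top_k_hubs` and `venn_region_sizes`, the gene identifiers and the edge lists are all hypothetical; the labels PA, SA and MS are taken from the caption, and the paper's own ranking procedure may differ.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def top_k_hubs(edges: Iterable[Tuple[str, str]], k: int) -> Set[str]:
    """Return the k genes with the most edges (degree) in an undirected network.

    Ties at the cutoff are resolved by Counter.most_common's ordering.
    """
    degree: Counter = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return {gene for gene, _ in degree.most_common(k)}


def venn_region_sizes(named_sets: Dict[str, Set[str]]) -> Dict[FrozenSet[str], int]:
    """Count items in each exclusive Venn region.

    Keys are frozensets of the set names an item belongs to, so
    frozenset({"PA", "SA", "MS"}) is the region shared by all three.
    """
    membership: Dict[str, Set[str]] = {}
    for name, items in named_sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)
    return dict(Counter(frozenset(names) for names in membership.values()))


if __name__ == "__main__":
    # Toy edge lists standing in for the three networks.
    pa = top_k_hubs([("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g4", "g1")], 2)
    sa = top_k_hubs([("g1", "g5"), ("g1", "g6"), ("g5", "g6"), ("g5", "g7")], 2)
    ms = top_k_hubs([("g1", "g8"), ("g1", "g9"), ("g8", "g9"), ("g8", "g7")], 2)
    regions = venn_region_sizes({"PA": pa, "SA": sa, "MS": ms})
    print(regions.get(frozenset({"PA", "SA", "MS"}), 0))
```

The same region count applies to edges as well as genes if each edge is first reduced to an order-independent key, such as a sorted tuple of its two gene identifiers.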
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Supplemental Figure 5. Characteristics of the full 1720-gene PPPTY gene set (ALL_1720) and of the genes with the highest AUROC values in the CSC method (CSC), PCC method (PCC), and MRNET method (MRNET). Average expression in CPM of the four gene sets is shown as squares; the average number of lowly expressed elements (CPM < 0) is shown as solid circles.
Supplemental Figure 6. Evaluation of network performance based on sample size and inference method. A, AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. Dashed lines are the average AUROC values from the 17 individual networks of each category. Mean values of each network are marked with asterisks. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
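The outlier rule stated in this legend (values beyond 1.5 times the interquartile range from the quartiles) can be sketched as follows. This is a minimal illustration in Python, not the authors' code; the function name `iqr_outliers` and the AUROC values are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`, which may differ slightly from the definition used by the plotting software.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # Quartile definition is an assumption; plotting tools may use another.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical AUROC values for one network (illustration only).
aurocs = [0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.95]
print(iqr_outliers(aurocs))
```

Changing `k` would tighten or relax the rule; 1.5 is the value given in the legend.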
Supplemental Figure 7. GCN performance comparison between protein networks. A, Area Under the ROC curve (AUROC) values from GO evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). B, AUROC values from PPPTY evaluation of protein networks with 17862 genes (ppr_all) and with 11429 genes (ppr). Both networks were constructed by Pearson Correlation Coefficient (PCC). Bold horizontal lines indicate median. Asterisks indicate mean and grey dots indicate outliers.
Supplemental Figure 8. Average neighborhood connectivity for three selected networks: PCC-aggregated (PA), SCC-aggregated (SA), and MRNET-single (MS). The average neighborhood connectivity distribution of all genes is plotted against the number of neighbors. The top one million edges were chosen for each network. Red and blue curves show the power-law fitted distributions. The R² value indicates the goodness of fit to the power-law model.
Supplemental Figure 9. Node degree distribution for four selected networks: PCC-aggregated (PA), SCC-aggregated (SA), MRNET-single (MS), and the intersection among the three networks (Merged network). The number of edges linked to a gene (node degree) was plotted against the number of genes with that degree (number of nodes). The red curve shows the power-law fitted distribution, with the fitted function and R² indicated beside it.
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are light grey nodes.
Literature Cited
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381–90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5–e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938–941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123–2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101–113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289–300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707–720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685–90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545–3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szcześniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707–726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572–15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143–175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671–683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670–85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64–70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575–1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924–13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054–0066
Fedoroff NV (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200–20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244–56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R package version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277–286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860–1866
Giorgi F, Fabbro C Del, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1–8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866–12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29–46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25–48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592–1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357–360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520–2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812–1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532–545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43–50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602–4616
Li S, Łabaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888–895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664–675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923–930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282–288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177–188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580–585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192–203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423–1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796–804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929–940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science 792–801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745–64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621–628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69–71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987–D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267–1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 2013;4:291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008-2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008-2016.
Figure 2. Effect of normalization and network inference methods on single-network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for comparisons with samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
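Because every panel reports AUROC values, a short sketch of how an AUROC can be computed from ranked scores may help. The Python below is a generic rank-based (Mann-Whitney) calculation, not the authors' evaluation pipeline; the function name `auroc`, the scores, and the labels are hypothetical.

```python
def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic; tied scores get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical prediction scores and known annotations (1 = gene in the set).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(round(auroc(scores, labels), 3))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of annotated from unannotated genes.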
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to the standard normal distribution, resulting in scaled AUROC values shown on a color scale from blue (low) to red (high). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
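The scaling and clustering steps named in this legend (scaling AUROC values to a standard normal distribution, then comparing methods by Euclidean distance) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: scaling is done per GO term across methods, the population standard deviation is used, and all names and values are hypothetical rather than taken from the paper.

```python
import math
from statistics import mean, pstdev


def zscores(values):
    """Scale values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical AUROC values: one row per GO term, one column per method.
methods = ["PCC", "SCC", "MRNET"]
auroc_rows = [
    [0.70, 0.71, 0.60],
    [0.65, 0.66, 0.72],
    [0.80, 0.79, 0.68],
]

# Assumption: each GO term (row) is scaled across methods.
scaled_rows = [zscores(row) for row in auroc_rows]

# One scaled profile per method, used to compare methods pairwise.
profiles = {m: [row[j] for row in scaled_rows] for j, m in enumerate(methods)}
d_pcc_scc = euclidean(profiles["PCC"], profiles["SCC"])
d_pcc_mrnet = euclidean(profiles["PCC"], profiles["MRNET"])
print(d_pcc_scc < d_pcc_mrnet)
```

In this toy input the two correlation methods have similar profiles, so they would be grouped together before the mutual information method in a distance-based clustering.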
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
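The logarithmic model in this legend has the form average AUROC = a + b · ln(sample size). The legend states that the fit was done with lm() in R; the Python below is only an equivalent least-squares sketch for illustration, with a hypothetical function name and made-up data, and it reports R-squared but not the p-value.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample_size).

    Returns (intercept, slope, r_squared).
    """
    xs = [math.log(n) for n in sample_sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(aurocs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, aurocs))
    ss_tot = sum((y - mean_y) ** 2 for y in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical average AUROC values (illustration only, not the paper's data).
sizes = [12, 50, 200, 800, 1266]
avg_aurocs = [0.60, 0.64, 0.67, 0.70, 0.71]
intercept, slope, r_squared = fit_log_model(sizes, avg_aurocs)
print(f"slope={slope:.4f}, R2={r_squared:.3f}")
```

A positive slope with a high R-squared would indicate that performance grows roughly linearly with the logarithm of sample size, that is, with diminishing returns per added library.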
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate median. Asterisks indicate mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate median. Asterisks indicate mean and grey dots indicate outliers.
Fig 7 panels: A and B, Venn diagrams for the MS, PA and SA networks; C, GO term enrichment graph labelled Hub; D, GO term enrichment graph labelled Cell Wall.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in the three networks; 106591 edges were shared among the three networks. C, GO term enrichment, analyzed with AgriGO, for the 148 highly interacting genes shared among PA, SA and MS. D, GO term enrichment, analyzed with AgriGO, for 214 co-expressed genes queried by 16 cell wall pathway genes.
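The edge overlap in panel B amounts to intersecting the top-ranked edge sets of the three networks. The sketch below illustrates that operation on toy data, treating edges as undirected. The gene names, weights and the helper `top_edges` are hypothetical and are not taken from the paper's networks or code.

```python
def top_edges(weights, k):
    """Return the k highest-weight edges as a set of undirected edges.

    weights: dict mapping (gene_a, gene_b) tuples to an edge score.
    Each edge becomes a frozenset so (a, b) and (b, a) compare equal.
    """
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {frozenset(edge) for edge, _ in ranked[:k]}

# Hypothetical toy networks standing in for PA, SA and MS.
pa = {("g1", "g2"): 0.90, ("g1", "g3"): 0.80, ("g2", "g3"): 0.10}
sa = {("g2", "g1"): 0.95, ("g3", "g1"): 0.20, ("g2", "g3"): 0.70}
ms = {("g1", "g2"): 0.60, ("g1", "g3"): 0.50, ("g2", "g3"): 0.40}

k = 2  # the figure uses the top 1 x 10^6 edges; 2 keeps the toy example readable
shared = top_edges(pa, k) & top_edges(sa, k) & top_edges(ms, k)
```

The same intersection applied to sets of top-ranked genes rather than edges would give the gene overlap shown in panel A.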
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics5 87
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within genecoexpression networks BMC Bioinformatics 6 227
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering PlantsStudy from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize HistoneDeacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstructionPLoS One 9 e106319 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Supplemental Figure 10. Network representation of the PCC ranked aggregation network (PA). Each node is a gene in the network. The eight largest modules detected by the Markov Cluster Algorithm (MCL) are highlighted in colors. Genes not in modules 1-8 are shown as light grey nodes.
Literature Cited
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szcześniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Łabaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger‐Wallace E Haug‐Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 101155200779879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 101371journalpone0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 20134291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database identified RNA-Seq samples generated between 2008 and 2016 on the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year from microarray platforms GPL4032 and GPL12620, identified by a text search (dashed line), compared with the number of Illumina RNA-Seq samples per year (solid line), 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
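The AUROC statistic named in the Figure 2 caption can be illustrated with a minimal sketch. It assumes each candidate gene has a prediction score and a binary label (1 if the gene belongs to the evaluation set, such as a GO term, and 0 otherwise) and uses the rank-sum identity for AUROC. The function name `auroc` and the toy scores are hypothetical, and the sketch covers only the statistic itself, not the evaluation procedure used in the paper.

```python
def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at sorted position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical toy input: four genes, two of which are annotated positives.
print(auroc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of annotated from unannotated genes.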
FigP
FigurePIPSimilarityPbetweenPninePinferencePmethodsPonPnetworkPperformancePbasedPuponPGOPA-PandPPPPTYPB-PevaluationI Genes with highest AUROC value in CSC or MRNETCLR are closed in blue and green box respectively AreaPunderPthePROCPcurvePAUROC-PvaluesPforPeachPGOPtermPorPgenesPwerePscaledPtoPstandardPnormalPdistributionMPresultingPinPscaledPAUROCPvaluesPbetweenPKPblue-PandPPred-IPSamplesPnormalizedPbyPVSTMPCPMPandPRPKMPwerePanalyzedPusingPeachPinferencePmethodsPPCCMPSCCMPKCCMPGCCMPBICMPCSCMPAAMPMAMPMRNETPandPCLR-PandPclusteredPbasedPonP EuclidianP distanceIP PCCP PearsonP CorrelationP CoefficientP SCCP SpearmanP correlationP coefficientP KCCP KendallP rankP CorrelationP CoefficientPGCCPGiniPcorrelationPcoefficientPBICPBiweightPmidcorrelationPCosinePSimilarityPCoefficientPCSC-PAAPAdditivePARCNEPMAPmultiplicativePARCNEPMRNETPMinimumPRedundancyPNETworkPCLRPContextPLikelihoodPofPRelatednessI
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
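The logarithmic model in this figure is an ordinary least-squares fit of average AUROC against log(sample size). The caption states that the fit was done with lm() in R; the sketch below is a Python stand-in that reproduces only the slope, intercept, and R-squared (the p-value is not computed). The function name `fit_log_model` and the data points are hypothetical.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Fit auroc = intercept + slope * ln(sample_size) by least squares.

    Returns (slope, intercept, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(aurocs) / n

    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R-squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data generated from an exact logarithmic relationship.
sizes = [12, 50, 200, 800, 1266]
values = [0.5 + 0.03 * math.log(s) for s in sizes]
slope, intercept, r_squared = fit_log_model(sizes, values)
```

A positive slope on the log scale means each doubling of the sample count adds a roughly constant increment to the average AUROC, so gains shrink in absolute terms as more libraries are added.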
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks built from the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
[Figure 7 image. A and B, Venn diagrams for the MS, PA, and SA networks. C, GO term network for the shared hub genes, including chromatin assembly, chromatin organization, nucleosome assembly, DNA replication, DNA repair, gene silencing, and gene expression. D, GO term network for the cell wall query, including cell wall biogenesis, cell wall organization, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process, and phenylpropanoid metabolic process.]
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
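The Venn counts in panels A and B amount to set operations on the top-ranked genes or edges of each network. The following is an illustrative sketch only: it assumes each network is available as a mapping from gene pairs to scores, treats edges as undirected, and uses invented gene IDs and weights. The names `top_edges` and `venn_counts` are hypothetical.

```python
def top_edges(weights, k):
    """Return the k highest-scoring edges as a set of undirected gene pairs."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {frozenset(pair) for pair, _ in ranked[:k]}


def venn_counts(a, b, c):
    """Count the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b": len((a & b) - c),
        "a_c": len((a & c) - b),
        "b_c": len((b & c) - a),
        "a_b_c": len(a & b & c),
    }


# Hypothetical edge scores for three small networks.
pa = {("g1", "g2"): 0.90, ("g2", "g3"): 0.80, ("g1", "g3"): 0.10}
sa = {("g2", "g1"): 0.95, ("g3", "g4"): 0.70, ("g2", "g3"): 0.20}
ms = {("g1", "g2"): 0.50, ("g2", "g3"): 0.40, ("g3", "g4"): 0.10}

counts = venn_counts(top_edges(pa, 2), top_edges(sa, 2), top_edges(ms, 2))
```

Storing each edge as a `frozenset` makes ("g1", "g2") and ("g2", "g1") the same edge, so the overlap does not depend on the order in which a network lists a gene pair.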
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
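Panel A combines two steps: intersecting several networks and then keeping only the interactions that touch a query gene. A minimal sketch of that idea follows, assuming each network is a set of undirected edges; the gene IDs and the function name `query_subnetwork` are hypothetical and the real query used 16 cell wall pathway genes.

```python
def query_subnetwork(networks, seeds):
    """Return edges present in every network that involve at least one seed gene.

    `networks` is a list of sets; each edge is a frozenset of two gene IDs.
    """
    shared = set.intersection(*networks)
    seed_set = set(seeds)
    return {edge for edge in shared if edge & seed_set}


def edge(a, b):
    """Build an undirected edge between two gene IDs."""
    return frozenset((a, b))


# Hypothetical networks; "s1" and "s2" stand in for the query genes.
pa_net = {edge("s1", "x"), edge("s1", "y"), edge("x", "y")}
sa_net = {edge("s1", "x"), edge("x", "y"), edge("s2", "z")}
ms_net = {edge("s1", "x"), edge("x", "y"), edge("s1", "y")}

subnetwork = query_subnetwork([pa_net, sa_net, ms_net], ["s1", "s2"])

# Genes pulled in by the query that were not seeds themselves.
candidates = set().union(*subnetwork) - {"s1", "s2"}
```

The non-seed genes in `candidates` correspond to the cyan and dark grey nodes in the figure: genes predicted to interact with the query set, whether or not they already have a reported cell wall function.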
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics5 87
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within genecoexpression networks BMC Bioinformatics 6 227
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering PlantsStudy from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize HistoneDeacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstructionPLoS One 9 e106319 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed from the samples using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed from the samples using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed from the samples using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
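The two quantities named in the Figure 2 legend, AUROC and the interquartile-range outlier rule, can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the manuscript cites the ROCR package in R). The function names `auroc` and `iqr_outliers`, the pairwise AUROC formulation, and the quantile interpolation convention are assumptions made here for illustration.

```python
from statistics import quantiles


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    scores: co-expression scores, higher means stronger predicted association.
    labels: 1 for a known association (e.g. shared annotation), 0 otherwise.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def iqr_outliers(values, k=1.5):
    """Values more than k * IQR above the upper quartile or below the lower quartile.

    The 'inclusive' quantile method is an assumption; the legend does not
    state which quantile convention was used.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the legends compare methods by how far their AUROC distributions sit above 0.5.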
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values between [illegible] (blue) and [illegible] (red). Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
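The scaling and distance steps mentioned in the Figure 3 legend can be sketched as follows. This is a hedged illustration only: it assumes that rows are GO terms (or genes), that columns are inference methods, that scaling is applied per row, and that the sample standard deviation is used. The function names `zscore_rows` and `column_distances` are hypothetical, and the hierarchical clustering that would consume the distance matrix is not shown.

```python
from math import sqrt
from statistics import mean, stdev


def zscore_rows(matrix):
    """Scale each row to mean 0 and unit (sample) standard deviation.

    Rows with zero variance are mapped to all zeros to avoid division by zero.
    """
    scaled = []
    for row in matrix:
        mu, sd = mean(row), stdev(row)
        scaled.append([(x - mu) / sd if sd > 0 else 0.0 for x in row])
    return scaled


def column_distances(matrix):
    """Pairwise Euclidean distances between the columns of a row-major matrix."""
    cols = list(zip(*matrix))
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(ci, cj))) for cj in cols]
        for ci in cols
    ]


# Synthetic AUROC values: rows are two GO terms, columns are three methods.
toy_auroc = [
    [0.70, 0.72, 0.80],
    [0.60, 0.61, 0.75],
]
toy_distances = column_distances(zscore_rows(toy_auroc))
```

In this toy matrix the first two columns have similar profiles, so their distance is smaller than the distance from either to the third column, which is the property a clustering step would use to group methods.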
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (AUROC) values from GO evaluation of 17 different-sized networks plotted against natural logarithm-transformed sample size, log(sample size). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against log(sample size). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
[Figure 4 panel titles: A, GO PCC, GO SCC, GO MRNET, GO CLR; B, PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR]
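The logarithmic model described in the Figure 4 legend has the form average AUROC = a + b × ln(sample size). The legend states that the fit was done with lm() in R; the sketch below is only a plain-Python stand-in that reproduces the least-squares slope, intercept, and R-squared. The p-value is not computed, the function name `fit_log_model` is hypothetical, and the data points are made up for illustration.

```python
import math


def fit_log_model(sample_sizes, mean_auroc):
    """Least-squares fit of mean_auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(mean_auroc)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Made-up values that follow the model exactly, not data from this study.
toy_sizes = [12, 50, 200, 800, 1266]
toy_auroc = [0.60 + 0.02 * math.log(n) for n in toy_sizes]
slope, intercept, r_squared = fit_log_model(toy_sizes, toy_auroc)
```

A positive slope in this model means that each multiplicative increase in sample size adds a roughly constant increment to average AUROC, so gains diminish as more libraries are added.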
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks built from the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, As in A, with AUROC values from PPPTY evaluation.
[Figure 6 plot labels: y-axis, AUC; panel A, Protein GO; panel B, Protein PPPTY]
[Figure 7 image. A and B: Venn diagrams for the MS, PA, and SA networks. C: GO enrichment graph labeled "Hub", dominated by chromatin assembly and organization, DNA replication and repair, gene silencing, and macromolecule metabolic process terms. D: GO enrichment graph labeled "Cell Wall", dominated by cell wall biogenesis and organization, cellulose and glucan metabolic process, polysaccharide biosynthetic process, and cell growth terms.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest-interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
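The overlap shown in the Figure 7 Venn diagrams can be sketched as a set intersection over the most connected genes of each network. The sketch assumes that "highest-interacting" means highest node degree in an unweighted edge list, which the legend does not state explicitly; the function name `top_connected_genes`, the gene identifiers, and the toy edge lists are hypothetical.

```python
from collections import Counter


def top_connected_genes(edges, k):
    """Return the k genes with the most edges (ties broken alphabetically)."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:k])


# Hypothetical edge lists standing in for the PA, SA, and MS networks.
toy_networks = {
    "PA": [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")],
    "SA": [("g1", "g2"), ("g1", "g5"), ("g2", "g5"), ("g2", "g3")],
    "MS": [("g1", "g3"), ("g1", "g4"), ("g3", "g4"), ("g3", "g2")],
}

top_genes = {
    name: top_connected_genes(edges, k=2) for name, edges in toy_networks.items()
}
shared_genes = set.intersection(*top_genes.values())
```

With k set to 1000 and the real networks as input, the size of `shared_genes` would correspond to the central region of the Venn diagram in panel A.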
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the query genes, cyan nodes are genes with reported functions in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics5 87
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within genecoexpression networks BMC Bioinformatics 6 227
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering PlantsStudy from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize HistoneDeacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstructionPLoS One 9 e106319 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of RNA-Seq Illumina samples per year (solid line), 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
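The AUROC values used throughout these evaluations can be illustrated with a minimal sketch. This is not the authors' pipeline; it assumes each candidate gene (or gene pair) has a co-expression score and a 0/1 label marking whether it belongs to the reference set (for example, a GO term). The function name `auroc` and the toy inputs are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    scores: co-expression scores; higher means a stronger predicted association.
    labels: 1 for a known positive in the reference set, 0 otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Rank scores in ascending order, giving tied scores their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic of the positives, normalized by the number of pos/neg pairs.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical toy example: two known positives, two negatives.
example = auroc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to a ranking that places every positive above every negative.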
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values displayed on a blue-to-red color scale. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
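The scaling and distance steps named in this caption can be sketched as follows. The sketch assumes that scaling is applied per row (one GO term or gene) across methods, which the caption does not state explicitly, and it stops at the pairwise Euclidean distance matrix on which a hierarchical clustering would operate. All names and AUROC values are hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Scale each row (one GO term or gene) to mean 0 and standard deviation 1."""
    scaled = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)
        # A constant row has no spread; map it to zeros instead of dividing by 0.
        scaled.append([0.0 if sd == 0 else (x - mu) / sd for x in row])
    return scaled


def euclidean_between_columns(matrix):
    """Pairwise Euclidean distances between columns (one column per method)."""
    cols = list(zip(*matrix))
    n = len(cols)
    return [[math.dist(cols[i], cols[j]) for j in range(n)] for i in range(n)]


methods = ["PCC", "SCC", "CLR"]
# Hypothetical AUROC values: rows are GO terms, columns follow `methods`.
auroc_values = [
    [0.70, 0.72, 0.60],
    [0.55, 0.56, 0.65],
    [0.80, 0.79, 0.62],
]
scaled = zscore_rows(auroc_values)
dist = euclidean_between_columns(scaled)
```

In this toy matrix the two correlation methods end up closer to each other than either is to CLR, which is the kind of grouping a distance-based clustering would then report.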
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
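The logarithmic model in this figure has the form average AUROC = intercept + slope × ln(sample size). The caption states the fit was done with lm() in R; the following is only an illustrative Python sketch of the same ordinary least-squares fit and its R-squared, with hypothetical function names and input values. It does not compute the p-value.

```python
import math


def fit_log_model(sample_sizes, avg_auroc):
    """Least-squares fit of avg_auroc = intercept + slope * ln(sample_size)."""
    x = [math.log(n) for n in sample_sizes]
    y = list(avg_auroc)
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Closed-form simple linear regression on the log-transformed predictor.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data generated from an exact logarithmic relationship.
sizes = [12, 50, 200, 800, 1266]
aurocs = [0.55 + 0.02 * math.log(n) for n in sizes]
slope, intercept, r2 = fit_log_model(sizes, aurocs)
```

A positive slope on the log scale means each doubling of the sample size adds a roughly constant increment to the average AUROC, so gains diminish as more libraries are added.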
Fig 4 panel titles: A, GO PCC, GO SCC, GO MRNET, GO CLR; B, PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR.
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks built from the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.
Fig 6 panel titles: A, Protein GO; B, Protein PPPTY. Y-axis label: AUC.
Fig 7 panel labels: A and B, Venn diagrams of the MS, PA and SA networks. C, GO term network labeled Hub, containing chromatin- and DNA-related terms such as chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, gene silencing and epigenetic regulation of gene expression. D, GO term network labeled Cell Wall, containing terms such as cellulose biosynthetic process, cell wall biogenesis, glucan metabolic process, hemicellulose metabolic process, phenylpropanoid biosynthetic process and root epidermal cell differentiation.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among all three. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106591 edges were shared among all three. C, GO term enrichment, analyzed by AgriGO, for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment, analyzed by AgriGO, for 214 co-expressed genes queried by 16 cell wall pathway genes.
[Figure 8 image: panels A, B, and C, network diagrams]
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the same 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the same 16 cell wall pathway genes. In all panels, red nodes are the query genes, cyan nodes are genes with reported functions in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering PlantsStudy from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize HistoneDeacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstructionPLoS One 9 e106319 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
- Parsed Citations
- Article File
- Figure 1
- Figure 2
- Figure 3
- Figure 4
- Figure 5
- Figure 6
- Figure 7
- Figure 8
- Parsed Citations
-
Page | 27
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of Illumina RNA-Seq samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using nine inference methods, including Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using nine inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using nine inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
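The two count-scaling normalizations named in this caption can be illustrated with a minimal sketch. It uses the standard definitions of CPM and RPKM rather than the authors' own pipeline, and the read counts and gene lengths are made-up toy values. VST is not shown because it depends on a fitted dispersion model.

```python
def cpm(counts):
    """Counts Per Million: scale each gene's read count by the library size."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def rpkm(counts, lengths_bp):
    """Reads Per Kilobase per Million: library-size scaling plus gene length in kb."""
    total = sum(counts)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


# Toy library with three genes (hypothetical counts and lengths in base pairs).
counts = [100, 300, 600]
lengths_bp = [1000, 2000, 500]

print(cpm(counts))
print(rpkm(counts, lengths_bp))
```

CPM corrects only for sequencing depth, whereas RPKM also divides by gene length, so the two can rank the same genes differently within a sample.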
Figure 3. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are displayed on a color scale from blue to red. Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
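The scaling and distance steps described for this heatmap can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: AUROC values are assumed to be scaled per GO term (row) using the sample standard deviation, and the AUROC numbers and the three-method subset are hypothetical.

```python
import math
from statistics import mean, stdev


def zscore(values):
    """Scale one row to mean 0 and unit sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)  # assumes the row is not constant (sd > 0)
    return [(v - mu) / sd for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical AUROC values: one row per GO term, one column per method.
methods = ["PCC", "SCC", "MRNET"]
auroc_rows = [
    [0.70, 0.72, 0.80],
    [0.60, 0.61, 0.75],
    [0.82, 0.80, 0.65],
]

# Scale each GO term across methods, then collect one profile per method.
scaled_rows = [zscore(row) for row in auroc_rows]
profiles = {m: [row[j] for row in scaled_rows] for j, m in enumerate(methods)}

# Pairwise distances between method profiles; a clustering step would start from these.
distances = {
    (a, b): euclidean(profiles[a], profiles[b])
    for i, a in enumerate(methods)
    for b in methods[i + 1:]
}
print(distances)
```

In this toy input the two correlation methods end up closer to each other than either is to MRNET, which is the kind of grouping a hierarchical clustering on these distances would report.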
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
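A logarithmic model of this kind is an ordinary least-squares fit of average AUROC against ln(sample size). The sketch below reproduces that fit in plain Python for illustration; the caption states that the authors used lm() in R, and the AUROC values and intermediate sample sizes here are hypothetical.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(aurocs) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, aurocs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - y_bar) ** 2 for yi in aurocs)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical average AUROC values for four network sizes.
sizes = [12, 50, 200, 1266]
avg_auroc = [0.62, 0.66, 0.69, 0.73]

intercept, slope, r_squared = fit_log_model(sizes, avg_auroc)
print(intercept, slope, r_squared)
```

A positive slope on the log scale means that each multiplicative increase in sample size adds a roughly constant increment to average AUROC, so gains shrink in absolute terms as libraries are added.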
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). In both panels, bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
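AUROC, the score compared across networks here, equals the probability that a randomly chosen true association is ranked above a randomly chosen false one. The following sketch computes it by direct pairwise comparison. It is a generic illustration, not the evaluation code used in the study, and the scores and labels are invented.

```python
def auroc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores: prediction scores, higher means more likely positive.
    labels: truthy for known positives, falsy for negatives.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical edge scores with known labels (1 = annotated as related).
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
print(auroc([0.9, 0.1, 0.8, 0.3], [1, 1, 0, 0]))
```

The pairwise loop is quadratic in the number of scored items, so a rank-based formulation would be preferable for genome-scale networks; the version above is kept for readability.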
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by the 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported functions in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 101155200779879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 101371journalpone0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 968 69ndash71 969
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks 970 for Arabidopsis Nucleic Acids Res 37 D987ndashD991 971
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene 972 modules with biological information in plants Bioinformatics 26 1267ndash1268 973
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol 974 Direct 4 14 975
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray 976 data BMC Bioinformatics 4 33 977
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush 978 J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data 979 bioRxiv 81802 980
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et 981 al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in 982 Maize Plant Cell Online 2 tpc114132506 983
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703–1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424–440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146–2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139–140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876–1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141–D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112–1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337–342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940–3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443–458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314–362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447–D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633–51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895–905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258–264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814–818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene–gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300–323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57–63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822–2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629–645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195–214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database identified RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year for microarray platforms GPL4032 and GPL12620, identified by a text search (dashed line), compared with the number of Illumina RNA-Seq samples per year (solid line) from 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A to C, Network performance for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods, evaluated by calculating area under the receiver operating characteristic curve (AUROC) values against the GO dataset (A), the PPPTY dataset (B) and HDA101 binding targets (C). D to F, Network performance for networks constructed using ten inference methods, evaluated by calculating AUROC values against the GO dataset (D), the PPPTY dataset (E) and HDA101 binding targets (F). The inference methods were Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
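The two quantities named in this caption, AUROC and the interquartile-range outlier rule, can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the reference list points to R tools such as ROCR and EGAD); the function names `auroc` and `boxplot_outliers` and all toy values are hypothetical, and the quartile method is an assumption that may differ from the one used for the published boxplots.

```python
import statistics


def auroc(scores, labels):
    """AUROC via the rank-sum identity; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the tie group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def boxplot_outliers(values):
    """Values beyond 1.5 x IQR above the upper or below the lower quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical edge scores for four gene pairs; label 1 marks a pair that
# shares an annotation in the reference set.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
# Hypothetical per-term AUROC values with one extreme value.
print(boxplot_outliers([0.5, 0.52, 0.55, 0.57, 0.6, 0.95]))  # [0.95]
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to a ranking that places every annotated pair above every unannotated one.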
Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, resulting in scaled AUROC values shown on a color scale from blue (negative) to red (positive). Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
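The scaling and distance steps described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: rows are taken to be GO terms (or genes), columns are inference methods, each row is z-scored, and methods are compared by Euclidean distance between their columns. The helper names and the toy AUROC values are hypothetical, and the full hierarchical clustering is reduced here to finding the closest pair of methods.

```python
import itertools
import math
import statistics


def zscore(row):
    """Scale one row to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(row)
    sd = statistics.stdev(row)
    if sd == 0:
        return [0.0 for _ in row]  # a constant row carries no contrast
    return [(x - mean) / sd for x in row]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_methods(matrix, methods):
    """Return the pair of methods whose scaled AUROC profiles are nearest."""
    scaled = [zscore(row) for row in matrix]
    columns = list(zip(*scaled))  # one profile per method
    pairs = itertools.combinations(range(len(methods)), 2)
    i, j = min(pairs, key=lambda p: euclidean(columns[p[0]], columns[p[1]]))
    return methods[i], methods[j]


# Hypothetical AUROC values: three GO terms (rows) by three methods (columns).
methods = ["PCC", "SCC", "MRNET"]
auroc_matrix = [
    [0.70, 0.72, 0.55],
    [0.60, 0.61, 0.80],
    [0.65, 0.66, 0.50],
]
print(closest_methods(auroc_matrix, methods))
```

In this toy matrix the two correlation methods track each other across terms, so they form the closest pair, which is the kind of grouping a heatmap dendrogram would show.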
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks, plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks, plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
[Figure 4 panels: A, GO evaluation (PCC, SCC, MRNET, CLR); B, PPPTY evaluation (PCC, SCC, MRNET, CLR)]
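The logarithmic regression in Figure 4 has the form average AUROC = a + b ln(sample size). The sketch below fits that model by ordinary least squares in plain Python. It is only an illustration: the caption states that the fit was done with lm() in R, the function name `fit_log_model` is hypothetical, the data are synthetic, and p-values are not computed here.

```python
import math


def fit_log_model(sample_sizes, auroc_values):
    """Least-squares fit of auroc = intercept + slope * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(auroc_values)
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic data generated from a known model, so the fit should recover it.
# The sizes 12 and 1266 match the smallest and largest networks in Figure 5;
# the intermediate sizes and all AUROC values are made up for illustration.
sizes = [12, 50, 200, 1266]
values = [0.5 + 0.02 * math.log(n) for n in sizes]
intercept, slope, r_squared = fit_log_model(sizes, values)
print(round(intercept, 6), round(slope, 6), round(r_squared, 6))
```

A positive slope under this model means each doubling of the sample size adds a constant increment, b ln 2, to the average AUROC.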
Figure 5. Evaluation of networks constructed from different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. Networks built from the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars) and protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). In both panels, bold horizontal lines indicate medians, asterisks indicate means and grey dots indicate outliers.
[Figure 6 panels: A, GO evaluation; B, PPPTY evaluation; y-axis, AUC; the protein network is labeled "Protein" in each panel]
[Figure 7 panels: A and B, Venn diagrams for the MS, PA and SA networks; C, AgriGO GO term graph with a hub node and node labels centered on chromatin assembly and organization, nucleosome assembly, DNA replication and repair, gene silencing and epigenetic regulation of gene expression; D, AgriGO GO term graph with a cell wall node and node labels centered on cell wall biogenesis and organization, cellulose and glucan biosynthesis, polysaccharide metabolism, phenylpropanoid metabolism and root epidermal cell differentiation]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 most highly interacting genes in three networks; 148 genes were shared among all three. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106,591 edges were shared among all three. C, GO term enrichment, analyzed by AgriGO, for the 148 highly interacting genes shared among PA, SA and MS. D, GO term enrichment, analyzed by AgriGO, for 214 co-expressed genes queried by 16 cell wall pathway genes.
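The overlaps reported in the Figure 7 Venn diagrams amount to set operations on the top-ranked genes or edges of each network. A minimal sketch follows, assuming each network is available as a mapping from item name to score; how the authors ranked genes as "highly interacting" is not specified in the caption, so the `top_k` and `venn3_counts` helpers, the scores and the gene names are all hypothetical.

```python
def top_k(scores, k):
    """Names of the k highest-scoring items in a {name: score} mapping."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def venn3_counts(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


# Hypothetical connectivity scores for a handful of genes in three networks.
pa = {"g1": 9, "g2": 8, "g3": 7, "g4": 6, "g5": 1}
sa = {"g3": 9, "g4": 8, "g5": 7, "g1": 2, "g2": 1}
ms = {"g4": 9, "g5": 8, "g6": 7, "g1": 3, "g2": 2}

counts = venn3_counts(top_k(pa, 4), top_k(sa, 3), top_k(ms, 3))
print(counts)
```

The "A&B&C" entry corresponds to the shared core reported in the caption (148 genes in panel A, 106,591 edges in panel B), and the seven regions always sum to the size of the union of the three sets.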
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with a reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE. Science (80- ) 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol. doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One. doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science (80- ) 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) "Out of Pollen" Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp.15.01821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year, 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO datasets for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
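The AUROC values reported in Figure 2 summarize how well a ranked list of scores separates known positives from negatives. The following is a minimal sketch of one standard way to compute such a value, through the Mann-Whitney rank-sum identity. It is not taken from the authors' pipeline (the reference list cites the ROCR package), and the function name `auroc` and its inputs are hypothetical.

```python
def auroc(scores, labels):
    """AUROC from the Mann-Whitney U statistic; tied scores share an average rank.

    scores: list of numbers, higher meaning "more likely positive".
    labels: list of 0/1 values of the same length (1 = known positive).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)
```

A value of 0.5 corresponds to a random ranking and 1.0 to a perfect one, which is the scale on which the box plots in Figure 2 are read.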
Figure 3. Similarity between ten inference methods in network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are colored from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
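The two operations named in the Figure 3 legend, scaling each row of AUROC values to a standard normal distribution and comparing methods by Euclidean distance, can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the use of the sample standard deviation is an assumption, since the legend does not say which estimator was used.

```python
import math
import statistics


def zscore(values):
    """Scale one row of AUROC values to mean 0 and standard deviation 1.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def euclidean(u, v):
    """Euclidean distance between two equally long score vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Hierarchical clustering of the methods would then operate on the matrix of pairwise `euclidean` distances between their scaled AUROC profiles.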
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
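The logarithmic model in Figure 4 regresses average AUROC on the natural logarithm of sample size. The legend states that the fit was done with lm() in R; the sketch below is a plain Python stand-in for the same ordinary least-squares fit, written under that assumption. The function name `fit_log_model` is hypothetical, and the p-value is omitted because it requires a t distribution that the Python standard library does not provide.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = a + b * ln(sample_size).

    Returns (a, b, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    y = list(aurocs)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Slope and intercept from the usual closed-form OLS expressions.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x

    # R-squared = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot
```

Called with the 17 sample sizes and their average AUROC values, the function returns the intercept, the slope per unit of log(sample size), and the R-squared of the fitted line.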
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks built from the same number of samples are designated as "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.
[Figure 7 graphic. Panels A and B: Venn diagrams for the MS, PA and SA networks. Panel C ("Hub"): GO term enrichment graph with node labels including chromatin assembly, gene silencing, DNA replication, DNA repair and RNA processing. Panel D ("Cell Wall"): GO term enrichment graph with node labels including cell wall biogenesis, cellulose biosynthetic process, phenylpropanoid biosynthetic process and hemicellulose metabolic process.]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
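The Venn diagrams in Figure 7 compare the top-ranked genes (or edges) of three networks. A minimal sketch of that comparison is given below, assuming each network is available as a mapping from a gene or edge identifier to a score. The helper names `top_n` and `venn3_counts` and the mapping layout are hypothetical and are not taken from the authors' code.

```python
def top_n(scores, n):
    """Identifiers of the n highest-scoring items in a {identifier: score} mapping."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


def venn3_counts(a, b, c):
    """Sizes of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b": len((a & b) - c),   # shared by a and b but not c
        "a_c": len((a & c) - b),
        "b_c": len((b & c) - a),
        "a_b_c": len(a & b & c),   # shared by all three sets
    }
```

With the three networks standing in for `a`, `b` and `c`, the `a_b_c` entry corresponds to the shared-by-all count reported in the legend (148 genes in panel A, 106591 edges in panel B).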
[Fig 8, panels A-C.]
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with a reported function in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
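The subnetwork in Figure 8A combines two operations: intersecting the three networks and querying the result with a set of seed genes. A minimal sketch of that idea follows, assuming undirected edge lists and assuming that "queried by" means keeping edges that touch at least one seed gene. The names `normalize_edges` and `seed_subnetwork` and the gene identifiers are hypothetical, and the order of the two operations in the original analysis is not stated in the caption.

```python
def normalize_edges(edges):
    """Treat edges as undirected by storing each gene pair in sorted order."""
    return {tuple(sorted(pair)) for pair in edges}


def seed_subnetwork(networks, seeds):
    """Return edges found in every network that touch at least one seed gene."""
    shared = set.intersection(*(normalize_edges(e) for e in networks.values()))
    return {edge for edge in shared if seeds & set(edge)}


# Toy edge lists standing in for the PA, SA, and MS networks (assumed data).
demo_networks = {
    "PA": [("seed1", "g2"), ("g2", "g3"), ("seed1", "g4")],
    "SA": [("g2", "seed1"), ("g3", "g2"), ("g5", "seed1")],
    "MS": [("seed1", "g2"), ("g2", "g3")],
}

subnet = seed_subnetwork(demo_networks, seeds={"seed1"})
print(sorted(subnet))
```

Requiring an edge to appear in all three networks is the strictest way to combine them; a looser variant could keep edges supported by any two of the three.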
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Downloaded on August 22, 2020 - Published by www.plantphysiol.org. Copyright © 2017 American Society of Plant Biologists. All rights reserved.
Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the total values for years 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared to the number of RNA-Seq Illumina samples (solid line) per year from 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM), or Reads Per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM, or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM, or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini Correlation Coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), Multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET), and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
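The AUROC values reported in Figure 2 summarize how well a ranking separates known positives from negatives. The following is a minimal sketch of that statistic, assuming a list of prediction scores and binary labels; it uses the Mann-Whitney rank-sum identity and is not the evaluation pipeline used in the study. The function name `auroc` and its inputs are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity.

    scores: numeric prediction scores (higher means more likely positive).
    labels: truthy for positives, falsy for negatives.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")

    # Rank the scores in ascending order (1-based), averaging ranks over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # U statistic of the positives, normalized by the number of pos/neg pairs.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)
```

A value of 0.5 corresponds to a random ranking and 1.0 to a perfect one, which is the scale on which the normalization and inference methods are compared.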
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to the standard normal distribution, resulting in scaled AUROC values shown on a color scale from blue to red. Samples normalized by VST, CPM, and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET, and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini Correlation Coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, Multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
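The scaling and distance steps described for Figure 3 can be sketched as follows. This is an illustration only: it assumes that scaling is applied per GO term (row) across methods, as is common for heatmaps, and it stops at the pairwise Euclidean distances that a clustering routine would take as input. The function names and the small AUROC matrix are hypothetical.

```python
import math
import statistics


def scale_rows(matrix):
    """Scale each row (one GO term) to mean 0 and sample SD 1 across methods."""
    scaled = []
    for row in matrix:
        mean = statistics.mean(row)
        sd = statistics.stdev(row)
        # A constant row carries no contrast between methods; map it to zeros.
        scaled.append([0.0 if sd == 0 else (v - mean) / sd for v in row])
    return scaled


def method_distances(methods, matrix):
    """Euclidean distance between every pair of method columns after scaling."""
    scaled = scale_rows(matrix)
    columns = {m: [row[j] for row in scaled] for j, m in enumerate(methods)}
    distances = {}
    for i, a in enumerate(methods):
        for b in methods[i + 1:]:
            squared = sum((x - y) ** 2 for x, y in zip(columns[a], columns[b]))
            distances[(a, b)] = math.sqrt(squared)
    return distances


# Hypothetical AUROC values: rows are GO terms, columns follow `methods`.
methods = ["PCC", "SCC", "CLR"]
auroc_by_term = [
    [0.60, 0.60, 0.90],
    [0.70, 0.70, 0.50],
]
distances = method_distances(methods, auroc_by_term)
```

In this toy input PCC and SCC have identical profiles, so their distance is zero and they would be grouped first by any distance-based clustering.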
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models were plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Panel titles: GO PCC, GO SCC, GO MRNET, GO CLR (A); PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR (B).
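The regression in Figure 4 fits average AUROC against the natural logarithm of sample size. The sketch below shows an ordinary least squares fit of that form with its R-squared; it mirrors what lm() would estimate for a single predictor but does not compute p-values. The function name `fit_log_model` and the synthetic AUROC values are assumptions for illustration, not data from the study.

```python
import math


def fit_log_model(sample_sizes, mean_auroc):
    """Least squares fit of mean_auroc = intercept + slope * ln(sample size)."""
    if len(sample_sizes) != len(mean_auroc) or len(sample_sizes) < 2:
        raise ValueError("need two or more paired observations")
    x = [math.log(n) for n in sample_sizes]
    y = list(mean_auroc)
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("sample sizes must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Synthetic values generated from a known logarithmic trend (not real data).
sizes = [12, 100, 1266]
synthetic_auroc = [0.5 + 0.02 * math.log(n) for n in sizes]
slope, intercept, r_squared = fit_log_model(sizes, synthetic_auroc)
```

A positive slope in this model means performance keeps improving with more libraries, but with diminishing returns, because each unit on the x-axis is a multiplicative increase in sample size.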
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks constructed with the same number of samples are designated as "_1", "_2", and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman Correlation Coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). In both panels, bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in the three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Panel C node labels (Hub) center on chromatin and DNA metabolism, for example chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, and gene silencing. Panel D node labels (Cell Wall) center on cell wall processes, for example cell wall biogenesis, cellulose biosynthetic process, hemicellulose metabolic process, and phenylpropanoid biosynthetic process.
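The Venn comparison in Figure 7A amounts to intersecting the sets of most highly connected genes from each network. The sketch below assumes that "highest interacting" means highest node degree in an undirected edge list; that reading, the function names, and the toy gene identifiers are assumptions for illustration.

```python
from collections import Counter


def top_hubs(edges, k):
    """Return the k genes with the most edges in an undirected edge list."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    # Sort by descending degree, then by name so that ties are deterministic.
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:k])


def shared_hubs(networks, k):
    """Genes that are among the top-k hubs in every network."""
    hub_sets = [top_hubs(edges, k) for edges in networks.values()]
    if not hub_sets:
        return set()
    return set.intersection(*hub_sets)


# Toy networks keyed by the abbreviations used in the figure (hypothetical edges).
toy_networks = {
    "PA": [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")],
    "SA": [("g1", "g2"), ("g1", "g5"), ("g2", "g5"), ("g2", "g6")],
    "MS": [("g1", "g7"), ("g1", "g8"), ("g7", "g8"), ("g7", "g2")],
}
shared = shared_hubs(toy_networks, k=2)
```

With k set to 1000 and the three full networks as input, the size of the returned set would correspond to the central region of the Venn diagram.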
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cells functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc.114.132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 1. Number of maize microarray and RNA-Seq samples submitted to NCBI (https://www.ncbi.nlm.nih.gov) from 2008 to 2016. A, A text search of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database identified samples generated by microarray platforms GPL4032 and GPL12620; the totals for 2008 to 2016 were combined to represent the number of microarray studies (Microarray). A text search of the NCBI Sequence Read Archive (SRA) database was used to identify RNA-Seq samples generated between 2008 and 2016 using the Illumina sequencing platform (RNA-Seq). B, The number of samples submitted to the NCBI GEO database each year that were generated by microarray platforms GPL4032 and GPL12620 was identified by a text search (dashed line) and compared with the number of RNA-Seq Illumina samples (solid line) per year from 2008 to 2016.
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from GO dataset comparisons for samples normalized using the Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads Per Kilobase per Million (RPKM) methods. B, AUROC values from PPPTY dataset comparisons for samples normalized using the VST, CPM or RPKM methods. C, AUROC values from comparisons with HDA101 binding targets for samples normalized using the VST, CPM or RPKM methods. D, AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values are plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
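As a reading aid for the AUROC values in this legend, the following is a minimal sketch of how a single AUROC could be computed from prediction scores and binary labels using the rank-sum (Mann-Whitney) identity. The function name `auroc` and the toy scores are illustrative assumptions and are not taken from the study's pipeline.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum identity.

    scores: higher values mean a stronger prediction (hypothetical input).
    labels: truthy for positives (e.g. annotated with a GO term), falsy otherwise.
    Tied scores receive the average of the ranks they span.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy example with made-up scores: one positive is ranked below one negative.
example = auroc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])
print(example)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives, which is the scale on which the boxplots in this figure are read.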
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC values in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are colored on a scale from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks plotted against log(sample size). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
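The logarithmic model in this legend amounts to an ordinary least-squares fit of average AUROC against log(sample size). The sketch below illustrates that fit and its R-squared in plain Python; it is not the authors' R code, the function name `fit_log_model` is hypothetical, the data points are synthetic, and no p-value is computed.

```python
import math


def fit_log_model(sample_sizes, aurocs):
    """Least-squares fit of auroc = intercept + slope * ln(sample size).

    Returns (intercept, slope, r_squared).
    """
    x = [math.log(n) for n in sample_sizes]
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(aurocs) / count

    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("sample sizes must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, aurocs))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, aurocs))
    ss_tot = sum((yi - mean_y) ** 2 for yi in aurocs)
    if ss_tot == 0:
        raise ValueError("AUROC values must not all be identical")
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


# Synthetic points lying exactly on a log curve (not values from the paper).
sizes = [12, 100, 1266]
values = [0.5 + 0.03 * math.log(n) for n in sizes]
print(fit_log_model(sizes, values))
```

A positive slope in such a fit means that average AUROC increases with the logarithm of the number of libraries, so each doubling of sample size adds a roughly constant increment.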
Fig 4 panel titles: A, GO evaluation (PCC, SCC, MRNET, CLR); B, PPPTY evaluation (PCC, SCC, MRNET, CLR)
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries plotted against sample size. Networks with the same number of samples are designated "_1", "_2" and "_3". PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars) and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). In both panels, bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.
Fig 7 panel content: A and B, Venn diagrams for the MS, PA and SA networks. C, GO term graph for the hub genes, with terms including chromatin assembly, nucleosome organization, DNA replication, DNA repair, gene silencing, RNA processing and translation. D, GO term graph for the cell wall genes, with terms including cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process, phenylpropanoid biosynthetic process and root morphogenesis.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
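The overlap counts reported for panels A and B correspond to the region sizes of a three-set Venn diagram. As a minimal sketch of that bookkeeping, the snippet below counts each region for three collections of identifiers; the function name `venn3_counts` and the gene identifiers are made up for illustration and do not come from the networks in the paper.

```python
def venn3_counts(pa, sa, ms):
    """Sizes of the seven regions of a three-set Venn diagram.

    pa, sa, ms: iterables of hashable identifiers (genes or edges), named
    after the PA, SA and MS networks purely for illustration.
    """
    pa, sa, ms = set(pa), set(sa), set(ms)
    return {
        "pa_only": len(pa - sa - ms),
        "sa_only": len(sa - pa - ms),
        "ms_only": len(ms - pa - sa),
        "pa_sa_only": len((pa & sa) - ms),
        "pa_ms_only": len((pa & ms) - sa),
        "sa_ms_only": len((sa & ms) - pa),
        "all_three": len(pa & sa & ms),
    }


# Toy identifiers only; real inputs would be the top-ranked genes or edges.
counts = venn3_counts(
    {"g1", "g2", "g3"},
    {"g2", "g3", "g4"},
    {"g3", "g4", "g5"},
)
print(counts)
```

For edges of an undirected network, each pair would need a canonical form (for example a sorted tuple of the two gene identifiers) before being placed in a set, so that the same edge is not counted twice.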
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA) and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabasi A-L Oltvai ZNZN Barabási A-L (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff NV (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science (80- ) 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 2. Effect of normalization and network inference methods on single network performance. A, Network performance was evaluated by calculating area under the receiver operating characteristic curve (AUROC) values from comparisons with GO datasets for samples normalized using Variance Stabilizing Transformation (VST), Counts Per Million (CPM) or Reads per Kilobase per Million (RPKM) methods. B, Network performance was evaluated by calculating AUROC values from PPPTY dataset comparisons for samples normalized using VST, CPM or RPKM methods. C, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for samples normalized using VST, CPM or RPKM methods. D, Network performance was evaluated by calculating AUROC values from comparisons with the GO dataset for networks constructed using ten inference methods: Pearson Correlation Coefficient (PCC), Spearman correlation coefficient (SCC), Kendall rank Correlation Coefficient (KCC), Gini correlation coefficient (GCC), Biweight midcorrelation (BIC), Cosine Similarity Coefficient (CSC), Additive ARACNE (AA), multiplicative ARACNE (MA), Minimum Redundancy NETwork (MRNET) and Context Likelihood of Relatedness (CLR). E, Network performance was evaluated by calculating AUROC values from comparisons with PPPTY for networks constructed using the ten inference methods. F, Network performance was evaluated by calculating AUROC values from comparisons with HDA101 binding targets for networks constructed using the ten inference methods. Outliers were defined as values more than 1.5 times the interquartile range above the 75% quantile or below the 25% quantile. Median values were plotted as bold horizontal lines. For C to F, the horizontal dashed lines indicate the highest and lowest AUROC values.
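The AUROC score used throughout these legends can be illustrated with a small rank-based calculation. The sketch below is not the authors' pipeline (the reference list suggests ROCR in R was available to them); it is a minimal Python illustration, assuming that each candidate gene has a network-derived score and a binary label marking membership in the reference set (for example a GO term). The function name `auroc` and the toy values are hypothetical.

```python
def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U statistic), using average ranks for ties.

    scores: numeric score per gene (higher = more strongly predicted).
    labels: 1 if the gene belongs to the reference set, else 0.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    # U statistic of the positives, normalized by the number of pos/neg pairs.
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy values for illustration only, not data from the study.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0]
toy_auroc = auroc(toy_scores, toy_labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of reference genes from the rest.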
Figure 3. Similarity between ten inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are displayed on a blue-to-red color scale. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank Correlation Coefficient; GCC, Gini correlation coefficient; BIC, Biweight midcorrelation; CSC, Cosine Similarity Coefficient; AA, Additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
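The scaling and clustering steps named in this legend can be sketched as follows. This is an illustration only, under two stated assumptions: that scaling is a per-row z-score (using the sample standard deviation, as R's scale() does) and that methods are compared by Euclidean distance between their columns of scaled values. The helper names and the toy matrix are hypothetical and are not taken from the paper.

```python
import math
import statistics


def zscore_rows(matrix):
    """Scale each row (one GO term or gene) to mean 0 and sample SD 1."""
    scaled = []
    for row in matrix:
        mu = statistics.mean(row)
        sd = statistics.stdev(row)  # assumption: sample SD (n - 1 denominator)
        scaled.append([(v - mu) / sd for v in row])
    return scaled


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy AUROC matrix: rows are GO terms, columns are inference methods.
# Values are made up for illustration.
methods = ["PCC", "SCC", "CLR"]
toy_auroc = [
    [0.60, 0.62, 0.70],
    [0.55, 0.57, 0.50],
    [0.80, 0.78, 0.65],
]

scaled = zscore_rows(toy_auroc)
columns = list(zip(*scaled))  # one vector of scaled values per method
distances = {
    (methods[i], methods[j]): euclidean(columns[i], columns[j])
    for i in range(len(methods))
    for j in range(i + 1, len(methods))
}
```

In this toy example the two correlation methods end up closer to each other than either is to CLR, which is the kind of grouping a hierarchical clustering on these distances would expose.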
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. Separate panels are shown for the PCC, SCC, MRNET and CLR networks in each evaluation. PCC, Pearson Correlation Coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
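The logarithmic model in this legend has the form average AUROC = a + b * ln(sample size). The authors fitted it with lm() in R; the sketch below is a plain-Python ordinary least squares equivalent for the coefficients and R-squared only (the p-value is omitted). The function name `fit_log_model` and the sample values are hypothetical and generated for illustration.

```python
import math


def fit_log_model(sample_sizes, auroc_values):
    """Least-squares fit of auroc = a + b * ln(sample_size).

    Returns (intercept a, slope b, R-squared).
    """
    x = [math.log(n) for n in sample_sizes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(auroc_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, auroc_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum(
        (yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, auroc_values)
    )
    ss_tot = sum((yi - mean_y) ** 2 for yi in auroc_values)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Made-up values lying exactly on a log curve, not data from the study.
sizes = [12, 50, 200, 800, 1266]
scores = [0.55 + 0.02 * math.log(n) for n in sizes]
a, b, r2 = fit_log_model(sizes, scores)
```

A positive slope b with a high R-squared would indicate that performance keeps improving with sample size, but with diminishing returns on the original scale.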
Fig 5
A B
Figure 5 Evaluation of different sized networks constructed with different number of samples using PCC (black) SCC (green) MRNET (red) and CLR (blue) methods A Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size B Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size Networks with the same number of samples included are designated as ldquo_1rdquo ldquo_2rdquo and ldquo_3rdquo PCC Pearson Correlation Coefficient SCC Spearman correlation coefficient MRNET Minimum Redundancy NETwork CLR Context Likelihood of Relatedness
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of the single network (white bars), aggregation network (grey bars), and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
Fig 6 (panels A and B; axis label: AUC; panel labels: Protein GO, Protein PPPTY).
Fig 7 (panels A to D). A and B: Venn diagrams for the MS, PA, and SA networks. C: GO enrichment graph labeled "Hub", with terms including chromatin assembly, nucleosome organization, DNA packaging, DNA replication, DNA repair, gene silencing, and epigenetic regulation of gene expression. D: GO enrichment graph labeled "Cell Wall", with terms including cell wall biogenesis, cell wall organization, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process, and phenylpropanoid biosynthetic process.
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
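The overlap in Figure 7A can be illustrated with a small sketch. It assumes that "highest interacting genes" means the genes with the most edges (highest degree) in each network; that reading, the function name, and the toy edge lists are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def top_genes_by_degree(edges, k):
    """Return the k genes with the most edges; ties are broken by gene name."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return set(ranked[:k])

# Hypothetical toy networks standing in for PA, SA, and MS.
pa = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
sa = [("g1", "g2"), ("g1", "g5"), ("g5", "g6"), ("g5", "g7")]
ms = [("g1", "g3"), ("g1", "g6"), ("g3", "g6"), ("g3", "g2")]

k = 2  # the figure uses the top 1000 genes; 2 keeps the toy example small
top_pa = top_genes_by_degree(pa, k)
top_sa = top_genes_by_degree(sa, k)
top_ms = top_genes_by_degree(ms, k)

# Central region of the three-way Venn diagram.
shared_all = top_pa & top_sa & top_ms
# Genes found only in the PA top list.
only_pa = top_pa - top_sa - top_ms
print(sorted(shared_all), sorted(only_pa))
```

The same set operations apply to Figure 7B if each edge is stored as an unordered pair and the top edges of each network are compared instead of the top genes.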
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with a reported function in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
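The query in Figure 8A can be read as keeping only those edges that occur in all three networks and touch at least one query gene. A minimal sketch of that reading follows. The gene identifiers and edge lists are hypothetical, and the paper's actual retrieval procedure may differ in detail.

```python
def normalize(edges):
    """Store undirected edges as frozensets so (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges if edge[0] != edge[1]}

def seed_subnetwork(networks, seeds):
    """Edges present in every network that touch at least one seed gene."""
    shared = set.intersection(*(normalize(net) for net in networks))
    seed_set = set(seeds)
    return {edge for edge in shared if edge & seed_set}

# Hypothetical toy networks standing in for PA, SA, and MS.
pa = [("seed1", "x1"), ("seed1", "x2"), ("x2", "x3")]
sa = [("x1", "seed1"), ("seed1", "x2"), ("x3", "x4")]
ms = [("seed1", "x1"), ("x2", "x3"), ("seed1", "x5")]

subnetwork = seed_subnetwork([pa, sa, ms], seeds=["seed1"])

# Genes connected to the seeds in all three networks are the candidates
# shown as non-red nodes in the figure.
neighbors = set().union(*subnetwork) - {"seed1"}
print(sorted(neighbors))
```

Requiring an edge to appear in all three networks trades recall for confidence: only interactions supported by every inference approach survive the query.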
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure. Similarity between nine inference methods on network performance based upon GO (A) and PPPTY (B) evaluation. Genes with the highest AUROC value in CSC or MRNET/CLR are enclosed in blue and green boxes, respectively. Area under the ROC curve (AUROC) values for each GO term or gene were scaled to a standard normal distribution, and the scaled AUROC values are shown on a color scale from blue to red. Samples normalized by VST, CPM and RPKM were analyzed using each inference method (PCC, SCC, KCC, GCC, BIC, CSC, AA, MA, MRNET and CLR) and clustered based on Euclidean distance. PCC, Pearson correlation coefficient; SCC, Spearman correlation coefficient; KCC, Kendall rank correlation coefficient; GCC, Gini correlation coefficient; BIC, biweight midcorrelation; CSC, cosine similarity coefficient; AA, additive ARACNE; MA, multiplicative ARACNE; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different sized networks plotted against natural logarithm transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson correlation coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Panel titles: GO PCC, GO SCC, GO MRNET, GO CLR (A); PPPTY PCC, PPPTY SCC, PPPTY MRNET, PPPTY CLR (B).
Figure 5. Evaluation of different sized networks constructed with different numbers of samples using PCC (black), SCC (green), MRNET (red) and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries were plotted against sample size. Networks with the same number of samples included are designated as "_1", "_2" and "_3". PCC, Pearson correlation coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). B, AUROC values from PPPTY evaluation of single network (white bars), aggregation network (grey bars) and protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m) or CLR (c). In both panels, bold horizontal lines indicate the median, asterisks indicate the mean and grey dots indicate outliers.
Axis label: AUC. Panel labels: Protein GO (A), Protein PPPTY (B).
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Enriched GO terms labeled in panel C (Hub) include chromatin assembly, nucleosome organization, DNA replication, DNA repair and gene silencing; those labeled in panel D (Cell Wall) include cell wall biogenesis, cellulose biosynthetic process, glucan metabolic process and phenylpropanoid metabolic process.
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc.114.132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp.15.01821
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Figure 4. Effect of sample size on network performance. A, Average area under the ROC curve (average AUROC) values from GO evaluation of 17 different-sized networks, plotted against natural-logarithm-transformed sample size (log(sample size)). B, Average AUROC values from PPPTY evaluation of 17 different-sized networks, plotted against natural-logarithm-transformed sample size (log(sample size)). Fitted logarithmic regression models are plotted as black lines. R-squared and p-values were calculated with the lm() function in R. PCC, Pearson correlation coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Panel titles: A, GO PCC, GO SCC, GO MRNET, GO CLR; B, PPPTY PCC, PPPTY SCC, PPPTY CLR, PPPTY MRNET.
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. Networks constructed with the same number of samples are designated “_1”, “_2”, and “_3”. PCC, Pearson correlation coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
Panel titles: A, Protein GO; B, Protein PPPTY. Vertical axis: AUC.
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Node labels recovered from the figure: C, GO terms including chromatin assembly or disassembly, chromatin modification, gene silencing, DNA packaging, nucleosome organization, DNA replication, DNA repair, and RNA processing; D, GO terms including cell wall biogenesis, cell wall organization, cellulose biosynthetic process, glucan metabolic process, hemicellulose metabolic process, and phenylpropanoid biosynthetic process.
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabasi A-L Oltvai ZNZN Barabási A-L (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The significance of responses of the genome to challenge. Science: 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol Databases Curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome: 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 5. Evaluation of different-sized networks constructed with different numbers of samples using the PCC (black), SCC (green), MRNET (red), and CLR (blue) methods. A, Average AUROC values from GO evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. B, Average AUROC values from PPPTY evaluations of networks constructed using 12 (S12) to 1266 (S1266) libraries, plotted against sample size. Networks built from the same number of samples are designated "_1", "_2", and "_3". PCC, Pearson correlation coefficient; SCC, Spearman correlation coefficient; MRNET, Minimum Redundancy NETwork; CLR, Context Likelihood of Relatedness.
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers. B, AUROC values from PPPTY evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars), compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). Bold horizontal lines indicate medians, asterisks indicate means, and grey dots indicate outliers.
[Figure 7 graphic: panels A and B are Venn diagrams of the MS, PA, and SA networks; panels C and D are GO term enrichment maps labeled "Hub" and "Cell Wall".]
Figure 7. Comparison of the top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 × 10^6 edges in the three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
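The gene-level overlap summarized in Figure 7A can be illustrated with a minimal sketch. It assumes that "highest interacting" means highest node degree in an undirected edge list, which the caption does not state explicitly. The function names (top_hubs, shared_hubs) and the toy edge lists are hypothetical and are not taken from the study.

```python
from collections import Counter


def top_hubs(edges, k):
    """Return the k genes with the highest degree in an undirected edge list."""
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    # Sort by descending degree; ties are broken by gene name so the
    # result is deterministic.
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return set(ranked[:k])


def shared_hubs(networks, k):
    """Intersect the top-k hub sets of every network in a name -> edges dict."""
    hub_sets = [top_hubs(edges, k) for edges in networks.values()]
    return set.intersection(*hub_sets)


# Toy edge lists standing in for the PA, SA, and MS networks (made-up data).
pa = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
sa = [("g1", "g2"), ("g1", "g5"), ("g2", "g5"), ("g2", "g3")]
ms = [("g1", "g2"), ("g2", "g4"), ("g1", "g6"), ("g2", "g6")]

shared = shared_hubs({"PA": pa, "SA": sa, "MS": ms}, k=2)
print(sorted(shared))
```

With real networks, k would be 1000 and the size of the returned set would correspond to the central region of the Venn diagram. The same intersection applied to sets of edges, rather than genes, would give the edge-level overlap of panel B.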
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported functions in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5-e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabasi A-L, Oltvai ZNZN, Barabási A-L (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B: 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated withcomplex agronomic traits in rice Plant J 90 177-188
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissueexpression (GTEx) project Nat Genet 45 580-585
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol15 1
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based oncollaborative filtering framework Sci Rep 5 7702
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis PlantPhysiol 160 192-203
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 191423-1430
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks revealsnovel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics5 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170: pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 6. GCN performance comparison of networks constructed with 1266 libraries. A, Area under the ROC curve (AUROC) values from GO evaluation of single networks (white bars), aggregation networks (grey bars), and the protein network (dark grey bars) were compared for networks constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). B, AUROC values from PPPTY evaluation of the same single, aggregation, and protein networks, constructed using PCC (p), SCC (s), MRNET (m), or CLR (c). In both panels, bold horizontal lines indicate the median, asterisks indicate the mean, and grey dots indicate outliers.
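[Figure 6 graphic not reproduced. Recoverable labels: y-axis "AUC" in both panels; "Protein" and "GO" in panel A; "Protein" and "PPPTY" in panel B.]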
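The AUROC values summarized in Figure 6 can be illustrated with a small sketch. This is not the evaluation pipeline used in the study; it only shows how an AUROC is obtained from a ranked list of gene scores and binary annotation labels through the rank-sum (Mann-Whitney U) identity. The function name `auroc` and the toy scores and labels are assumptions made for illustration.

```python
def auroc(scores, labels):
    """AUROC from scores and binary labels via the rank-sum identity.

    Tied scores receive their average rank. `scores` and `labels` are
    hypothetical inputs: one score per gene and 1/0 for whether the gene
    carries the annotation being evaluated.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of tied scores starting at i.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # 1-based average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


# Toy example: annotated genes (label 1) receive the highest scores.
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
# Toy example: all scores tied, so the ranking carries no information.
uninformative = auroc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

An AUROC of 1.0 means every annotated gene outranks every unannotated gene, and 0.5 corresponds to a ranking with no predictive information.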
[Figure 7 graphic not reproduced. Panels A and B are three-set Venn diagrams for the MS, PA, and SA networks. Panel C is a GO term graph labeled "Hub"; its terms include chromatin assembly, DNA replication, gene silencing, and DNA repair. Panel D is a GO term graph labeled "Cell Wall"; its terms include cellulose biosynthetic process, cell wall biogenesis, and phenylpropanoid metabolic process.]
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in the three networks; 106,591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 highly interacting genes shared among PA, SA, and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
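The overlap counts reported in Figure 7 can be illustrated with a minimal sketch. It assumes each network is available as a list of (gene, gene, score) tuples and that the "highest interacting" genes are the genes with the most edges among the retained top edges; both are assumptions for illustration, not the procedure documented in the study. The helper names and the toy networks `pa`, `sa`, and `ms` are hypothetical.

```python
from collections import Counter


def top_edges(weighted_edges, k):
    """Return the k highest-scoring edges as unordered gene pairs."""
    ranked = sorted(weighted_edges, key=lambda e: e[2], reverse=True)
    return {frozenset((a, b)) for a, b, _ in ranked[:k]}


def top_hubs(edges, k):
    """Return the k genes with the highest degree in a set of edges."""
    degree = Counter()
    for edge in edges:
        for gene in edge:
            degree[gene] += 1
    # Break degree ties by gene name so the result is deterministic.
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return set(ranked[:k])


# Hypothetical toy networks standing in for PA, SA, and MS.
pa = [("g1", "g2", 0.9), ("g1", "g3", 0.8), ("g2", "g3", 0.7), ("g4", "g5", 0.1)]
sa = [("g2", "g1", 0.95), ("g1", "g3", 0.6), ("g3", "g4", 0.5), ("g4", "g5", 0.2)]
ms = [("g1", "g2", 0.7), ("g3", "g1", 0.65), ("g5", "g6", 0.6), ("g2", "g3", 0.1)]

edge_sets = [top_edges(net, k=2) for net in (pa, sa, ms)]
hub_sets = [top_hubs(edges, k=1) for edges in edge_sets]

shared_edges = set.intersection(*edge_sets)  # central region of the edge Venn diagram
shared_hubs = set.intersection(*hub_sets)    # central region of the gene Venn diagram
```

Storing each edge as a `frozenset` makes the comparison independent of gene order, so ("g2", "g1") in one network matches ("g1", "g2") in another.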
Figure 8. Cell wall pathway subnetworks. A, Intersection of the PCC aggregation (PA), SCC aggregation (SA), and MRNET single (MS) networks queried by 16 cell wall pathway genes. B, Network retrieved from the CORNET database queried by the same 16 cell wall pathway genes. C, Network retrieved from the STRING database queried by the same 16 cell wall pathway genes. In all panels, red nodes are the 16 query genes, cyan nodes are genes with reported function in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
Parsed Citations
Allen JD, Xie Y, Chen M, Girard L, Xiao G (2012) Comparing statistical methods for constructing large scale gene networks. PLoS One 7: e29348
Anders S, Huber W (2010) Differential expression analysis for sequence count data. Genome Biol 11: R106
Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48: 381-90
Aoki Y, Okamura Y, Tadaka S, Kinoshita K, Obayashi T (2015) ATTED-II in 2016: A Plant Coexpression Database Towards Lineage-Specific Coexpression. Plant Cell Physiol 57: e5-e5
Asnicar F, Masera L, Coller E, Gallo C, Sella N, Tolio T, Morettin P, Erculiani L, Galante F, Semeniuta S (2016) NES2RA: Network expansion by stratified variable subsetting and ranking aggregation. Int J High Perform Comput Appl 1094342016662508
Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320: 938-941
Ballouz S, Verleyen W, Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis: Safety in numbers. Bioinformatics 31: 2123-2130
Ballouz S, Weber M, Pavlidis P, Gillis J (2016) EGAD: Ultra-fast functional analysis of gene networks. bioRxiv 53868
Barabási A-L, Oltvai ZN (2004) Network biology: understanding the cell's functional organization. Nat Rev Genet 5: 101-113
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B 289-300
De Bodt S, Hollunder J, Nelissen H, Meulemeester N, Inzé D (2012) CORNET 2.0: Integrating plant coexpression, protein-protein interactions, regulatory interactions, gene associations and functional annotations. New Phytol 195: 707-720
Bolduc N, Yilmaz A, Mejia-Guerra MK, Morohashi K, O'Connor D, Grotewold E, Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems. Genes Dev 26: 1685-90
Bosch M, Mayer C-D, Cookson A, Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes. J Exp Bot 62: 3545-3561
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10: 421
Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X, et al (2016) A survey of best practices for RNA-seq data analysis. Genome Biol 17: 13
D'haeseleer P, Liang S, Somogyi R (2000) Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics 16: 707-726
Davidson RM, Hansey CN, Gowda M, Childs KL, Lin H, Vaillancourt B, Sekhon RS, de Leon N, Kaeppler SM, Jiang N, et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes. Plant Genome J 4: 191
Desprez T, Juraniec M, Crowell EF, Jouy H, Pochylova Z, Parcy F, Höfte H, Gonneau M, Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana. Proc Natl Acad Sci 104: 15572-15577
Dhillon IS, Modha DS (2001) Concept decompositions for large sparse text data using clustering. Mach Learn 42: 143-175
Dillies M-A, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot G, Castel D, Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Dillies MA, Rau A, Aubert J, Hennequet-Antier C, Jeanmougin M, Servant N, Keime C, Marot NS, Castel D, Estelle J, et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14: 671-683
Doncheva NT, Assenov Y, Domingues FS, Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures. Nature Protoc 7: 670-85
Dong J, Horvath S (2007) Understanding network concepts in modules. BMC Syst Biol 1: 24
Dotto MC, Petsch KA, Aukerman MJ, Beatty M, Hammell M, Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development. PLoS Genet 10: e1004826
Du D, Rawat N, Deng Z, Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory. Hortic Res 2: 15026
Du Z, Zhou X, Ling Y, Zhang Z, Su Z (2010) agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res 38: 64-70
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30: 1575-1584
Fadista J, Vikman P, Laakso EO, Mollet IG, Esguerra J Lou, Taneera J, Storm P, Osmark P, Ladenvall C, Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism. Proc Natl Acad Sci 111: 13924-13929
Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5: 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century. Proc Natl Acad Sci 109(50): 20200-20203
Ficklin SP, Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species: maize and rice. Plant Physiol 156: 1244-56
Filzmoser P, Fritz H, Kalcher K (2009) pcaPP: Robust PCA by Projection Pursuit. R Packag version 1
Fu J, Cheng Y, Linghu J, Yang X, Kang L, Zhang Z, Zhang J, He C, Du X, Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel. Nature Commun 4: 2832
Gao S, Ver Steeg G, Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables. Artificial Intelligence and Statistics 277-286
Gillis J, Pavlidis P (2011) The role of indirect connections in gene networks in predicting function. Bioinformatics 27: 1860-1866
Giorgi F, Del Fabbro C, Licausi F (2013) Comparative study of RNA-seq- and Microarray-derived coexpression networks in Arabidopsis thaliana. Bioinformatics 2: 1-8
Gu Y, Kaplinsky N, Bringmann M, Cobb A, Carroll A, Sampathkumar A, Baskin TI, Persson S, Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis. Proc Natl Acad Sci 107: 12866-12871
Han Y, Gao S, Muegge K, Zhang W, Zhou B (2015) Advanced applications of RNA sequencing and challenges. Bioinform Biol Insights 9: 29-46
Horvath S, Dong J (2008) Geometric interpretation of gene coexpression network analysis. PLoS Comput Biol 4: e1000117
Huang J, Lynn JS, Schulte L, Vendramin S, McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize. Int Rev Cell Mol Biol 328: 25-48
Iancu OD, Kawane S, Bottomly D, Searles R, Hitzemann R, McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference. Bioinformatics 28: 1592-1597
Kim D, Langmead B, Salzberg SL (2015) HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12: 357-360
Köster J, Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine. Bioinformatics 28: 2520-2522
Krouk G, Lingeman J, Colon A, Coruzzi G, Shasha D (2013) Gene regulatory networks in plants: learning causality from time and perturbation. Genome Biol 14: 123
Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol 34: 1812-1819
Kumari S, Nie J, Chen H-S, Ma H, Stewart R, Li X, Lu M-Z, Taylor WM, Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery. PLoS One 7: e50411
Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9: 559
Leinonen R, Sugawara H, Shumway M (2010) The sequence read archive. Nucleic Acids Res gkq1019
Li C, Qiao Z, Qi W, Wang Q, Yuan Y, Yang X, Tang Y, Mei B, Lv Y, Zhao H, et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize. Plant Cell 27: 532-545
Li H, Peng Z, Yang X, Wang W, Fu J, Wang J, Han Y, Chai Y, Guo T, Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels. Nat Genet 45: 43-50
Li J, Wei H, Zhao PX (2013b) DeGNServer: Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis. 2013
Li P, Piao Y, Shon HS, Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data. BMC Bioinformatics 16: 347
Li Q, Eichten SR, Hermanson PJ, Zaunbrecher VM, Song J, Wendt J, Rosenbaum H, Madzima TF, Sloan AE, Huang J, et al (2014a) Genetic Perturbation of the Maize Methylome. Plant Cell 26: 4602-4616
Li S, Labaj PP, Zumbo P, Sykacek P, Shi W, Shi L, Phan J, Wu P-Y, Wang M, Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data. Nature Biotechnol 32: 888-895
Li Y, Pearl SA, Jackson SA (2015c) Gene Networks in Plant Biology: Approaches in Reconstruction and Analysis. Trends Plant Sci 20: 664-675
Liao Y, Smyth GK, Shi W (2014) featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30: 923-930
Lim WK, Wang K, Lefebvre C, Califano A (2007) Comparative analysis of microarray normalization procedures: Effects on reverse engineering gene networks. Bioinformatics pp 282-288
Liu S, Liu Y, Zhao J, Cai S, Qian H, Zuo K, Zhao L, Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice. Plant J 90: 177-188
Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N (2013) The genotype-tissue expression (GTEx) project. Nat Genet 45: 580-585
Love MI, Huber W, Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 1
Luo X, You Z, Zhou M, Li S, Leung H, Xia Y, Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework. Sci Rep 5: 7702
Ma C, Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis. Plant Physiol 160: 192-203
Ma H-W, Zeng A-P (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19: 1423-1430
Ma S, Shah S, Bohnert HJ, Snyder M, Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways. PLoS Genet 9: e1003840
Marbach D, Costello JC, Küffner R, Vega NNM, Prill RJ, Camacho DM, Allison KR, Aderhold A, Allison KR, Bonneau R, et al (2012) Wisdom of crowds for robust gene network inference. Nat Methods 9: 796-804
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, Califano A (2006) ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7: S7
Mark Cigan A, Unger-Wallace E, Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize. Plant J 43: 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J 17: pp-10
Maza E, Frasse P, Senin P, Bouzayen M, Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments: A matter of relative size of studied transcriptomes. Commun Integr Biol 6: e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge. Science 792-801
Meyer PE, Kontos K, Lafitte F, Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks. Eurasip J Bioinforma Syst Biol doi: 10.1155/2007/79879
Meyer PE, Lafitte F, Bontempi G (2008) minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information. BMC Bioinformatics 9: 461
Monaco MK, Sen TZ, Dharmawardhana PD, Ren L, Schaeffer M, Naithani S, Amarasinghe V, Thomason J, Harper L, Gardiner J, et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis. Plant Genome 6: 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2: tpc.114.132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Fig 7 (panels A-D; image not reproduced: A and B are Venn diagrams for the MS PA and SA networks, C and D are GO term enrichment networks titled Hub and Cell Wall)
Figure 7. Comparison of top 1000 interacting genes in aggregation and single networks. A, Venn diagram showing the overlap among the 1000 highest interacting genes in three networks; 148 genes were shared among the three networks. PA, PCC ranked aggregation network; SA, SCC ranked aggregation network; MS, MRNET single network. B, Venn diagram showing the overlap among the top 1 x 10^6 edges in three networks; 106591 edges were shared among the three networks. C, GO term enrichment analyzed by AgriGO for the 148 shared highly interacting genes among PA, SA and MS. D, GO term enrichment analyzed by AgriGO for 214 co-expressed genes queried by 16 cell wall pathway genes.
Fig 8 (panels A-C; image not reproduced)
Figure 8. Cell wall pathway subnetworks. A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network predicted interactions.
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cells functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two - Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake - a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The significance of responses of the genome to challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K, Casas MI, Falcone Ferreyra ML, Falcone Ferreyra L, Mejía-Guerra MK, Pourcel L, Yilmaz A, Feller A, Carvalho B, Emiliani J, et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes. Plant Cell 24: 2745-64
Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, Su G, Bader GD, Ferrin TE (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436
Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research. Malawi Med J 24: 69-71
Obayashi T, Hayashi S, Saeki M, Ohta H, Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-D991
Ogata Y, Suzuki H, Sakurai N, Shibata D (2010) CoP: a database for characterizing co-expressed gene modules with biological information in plants. Bioinformatics 26: 1267-1268
Oshlack A, Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology. Biol Direct 4: 14
Park T, Yi S-G, Kang S-H, Lee S, Lee Y-S, Simon R (2003) Evaluation of normalization methods for microarray data. BMC Bioinformatics 4: 33
Paulson J, Chen C-Y, Lopes-Ramos CM, Kuijjer ML, Platig J, Sonawane AR, Fagny M, Glass K, Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data. bioRxiv 81802
Pautler M, Eveland AL, LaRue T, Yang F, Weeks R, Lunde C, Je B Il, Meeley R, Komatsu M, Vollbrecht E, et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize. Plant Cell Online 2 tpc114132506
Penning BW, Hunter 3rd CT, Tayengwa R, Eveland AL, Dugard CK, Olek AT, Vermerris W, Koch KE, McCarty DR, Davis MF, et al (2009) Genetic resources for maize cell wall biology. Plant Physiol 151: 1703-1728
Ponnala L, Wang Y, Sun Q, Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf. Plant J 78: 424-440
Rau A, Gallopin M, Celeux G, Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29: 2146-2152
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140
Sales G, Romualdi C (2011) parmigene: a parallel R package for mutual information estimation and gene network reconstruction. Bioinformatics 27: 1876-1877
Sangrador-Vegas A, Mitchell AL, Chang H-Y, Yong S-Y, Finn RD (2016) GO annotation in InterPro: why stability does not indicate accuracy in a sea of changing annotations. Database J Biol databases curation 2016
Sato Y, Antonio BA, Namiki N, Takehisa H, Minami H, Kamatsuki K, Sugimoto K, Shimizu Y, Hirochika H, Nagamura Y (2011) RiceXPro: a platform for monitoring gene expression in japonica rice grown under natural field conditions. Nucleic Acids Res 39: D1141-D1148
Schaefer RJ, Briskine R, Springer NM, Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB, the co-expression browser. PLoS One doi: 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation: Fast Correlation, Covariance, and Cosine Similarity. R package
Schnable P, Ware D, Fulton R, Stein J (2009) The B73 maize genome: complexity, diversity, and dynamics. Science 326: 1112-1115
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M (2011) Global quantification of mammalian gene expression control. Nature 473: 337-342
Scott-Boyer M-P, Haibe-Kains B, Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses. Frontiers in Genetics 4: 291
Serin EAR, Nijveen H, Hilhorst HWM, Ligterink W (2016) Learning from Co-expression Networks: Possibilities and Challenges. Front Plant Sci 7: 444
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498-2504
Sing T, Sander O, Beerenwinkel N, Lengauer T (2005) ROCR: visualizing classifier performance in R. Bioinformatics 21: 3940-3941
Singh M, Goel S, Meeley RB, Dantec C, Parrinello H, Michaud C, Leblanc O, Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein. Plant Cell 23: 443-458
Song L, Langfelder P, Horvath S (2012) Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 13: 328
Stelpflug SC, Rajandeep S, Vaillancourt B, Hirsch CN, Buell CR, Leon N De, Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development. Plant Genome 314-362
Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic M, Roth A, Santos A, Tsafou KP, et al (2015) STRING v10: Protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res 43: D447-D452
Tzfadia O, Diels T, De Meyer S, Vandepoele K, Aharoni A, Van de Peer Y (2016) CoExpNetViz: Comparative Co-Expression Networks Construction and Visualization Tool. Front Plant Sci 6: 1194
Usadel B, Obayashi T, Mutwil M, Giorgi FM, Bassel GW, Tanimoto M, Chow A, Steinhauser D, Persson S, Provart NJ (2009) Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant Cell Environ 32: 1633-51
USDA (2016) Grain: World Markets and Trade
Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W (2010) Lignin biosynthesis and structure. Plant Physiol 153: 895-905
Verdin E, Ott M (2014) 50 Years of Protein Acetylation: From Gene Regulation To Epigenetics, Metabolism and Beyond. Nat Rev Mol Cell Biol 16: 258-264
Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR (2016) Integration of omic networks in a developmental atlas of maize. Science 353: 814-818
Wang W, Zhou XJ, Liu Z, Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics 16: S6
Wang YXR, Jiang K, Feldman LJ, Bickel PJ, Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis. Ann Appl Stat 9: 300-323
Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63
Wei C, Li J, Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments. BMC Genomics 5: 87
Wolfe CJ, Kohane IS, Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks. BMC Bioinformatics 6: 227
Wu D-D, Wang X, Li Y, Zeng L, Irwin DM, Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants: Study from Arabidopsis thaliana. Genome Biol Evol 6: 2822-2829
Yang H, Liu X, Xin M, Du J, Hu Z, Peng H, Rossi V, Sun Q, Ni Z, Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development. Plant Cell 28: 629-645
Yim WC, Yu Y, Song K, Jang CS, Lee B-M (2013) PLANEX: the plant co-expression database. BMC Plant Biol 13: 83
Zheng W, Chung LM, Zhao H (2011) Bias detection and correction in RNA-Sequencing data. BMC Bioinformatics 12: 290
Zhong R, Allen JD, Xiao G, Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction. PLoS One 9: e106319
Zhong R, Ye ZH (2015) Secondary cell walls: Biosynthesis, patterned deposition and transcriptional regulation. Plant Cell Physiol 56: 195-214
Zhu G, Wu A, Xu X-J, Xiao P, Lu L, Liu J, Cao Y, Chen L, Wu J, Zhao X-M (2016) PPIM: A protein-protein interaction database for Maize. Plant Physiol 170 pp1501821-
Zyprych-Walczak J, Szabelska A, Handschuh L, Górczak K, Klamecka K, Figlerowicz M, Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis. Biomed Res Int 2015
Figure 8. Cell wall pathway subnetworks. A, Intersections of the PCC aggregation (PA), SCC aggregation (SA), and MRNET-single (MS) networks queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database queried by the same 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database queried by the same 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with a reported function in cell wall-related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall-related pathways, and grey lines indicate network-predicted interactions.
Parsed CitationsAllen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One7 e29348
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biologyPlant Cell Physiol 48 381-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Networkexpansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem WBaginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science (80- ) 320938-941
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbersBioinformatics 31 2123-2130
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Barabasi A-L Oltvai ZNZN Barabaacutesi A-L (2004) Network biology understanding the cells functional organization Nat Rev Genet 5101-113
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat SocSer B 289-300
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
De Bodt S Hollunder J Nelissen H Meulemeester N Inzeacute D (2012) CORNET 20 Integrating plant coexpression protein-proteininteractions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K OConnor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatorynetwork in maize meristems Genes Dev 26 1685-90
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMCBioinformatics 10 421
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al(2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhaeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineeringBioinformatics 16 707-726
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utilityof RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Houmlfte H Gonneau M Vernhettes S (2007) Organization of cellulosesynthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) Acomprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14671-683
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networksand protein structures Nature Protoc 7 670-85
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulatedand phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using randommatrix theory Hortic Res 2 15026
Pubmed Author and TitleCrossRef Author and Title
wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Google Scholar Author Only Title Only Author and Title
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 301575-1584
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Globalgenomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl AcadSci 111 13924-13929
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping andvalidation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fedoroff N V (2012) McClintocks challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass speciesmaize and rice Plant Physiol 156 1244-56
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatorynetwork in the maize kernel Nature Commun 42832
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables ArtificialIntelligence and Statistics 277-286
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsisthaliana Bioinformatics 2 1-8
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of acellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 929-46
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput biol 4 e1000117Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int RevCell Mol Biol 328 25-48
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpressionnetwork inference Bioinformatics 28 1592-1597
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Koumlster J Rahmann S (2012) Snakemakemdasha scalable bioinformatics workflow engine Bioinformatics 28 2520-2522Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time andperturbation Genome Biol 14 123
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol34 1812-1819
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods forcoexpression network construction and biological knowledge discovery PLoS One 7 e50411
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNATargets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Parsed Citations
Allen JD Xie Y Chen M Girard L Xiao G (2012) Comparing statistical methods for constructing large scale gene networks PLoS One 7 e29348
Anders S Huber W (2010) Differential expression analysis for sequence count data Genome Biol 11 R106
Aoki K Ogata Y Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology Plant Cell Physiol 48 381-90
Aoki Y Okamura Y Tadaka S Kinoshita K Obayashi T (2015) ATTED-II in 2016 A Plant Coexpression Database Towards Lineage-Specific Coexpression Plant Cell Physiol 57 e5-e5
Asnicar F Masera L Coller E Gallo C Sella N Tolio T Morettin P Erculiani L Galante F Semeniuta S (2016) NES2RA Network expansion by stratified variable subsetting and ranking aggregation Int J High Perform Comput Appl 1094342016662508
Baerenfaller K Grossmann J Grobei MA Hull R Hirsch-Hoffmann M Yalovsky S Zimmermann P Grossniklaus U Gruissem W Baginsky S (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics Science 320 938-941
Ballouz S Verleyen W Gillis J (2015) Guidance for RNA-seq co-expression network construction and analysis Safety in numbers Bioinformatics 31 2123-2130
Ballouz S Weber M Pavlidis P Gillis J (2016) EGAD Ultra-fast functional analysis of gene networks bioRxiv 53868
Barabási A-L Oltvai ZN (2004) Network biology understanding the cell's functional organization Nat Rev Genet 5 101-113
Benjamini Y Hochberg Y (1995) Controlling the false discovery rate a practical and powerful approach to multiple testing J R Stat Soc Ser B 289-300
De Bodt S Hollunder J Nelissen H Meulemeester N Inzé D (2012) CORNET 2.0 Integrating plant coexpression protein-protein interactions regulatory interactions gene associations and functional annotations New Phytol 195 707-720
Bolduc N Yilmaz A Mejia-Guerra MK Morohashi K O'Connor D Grotewold E Hake S (2012) Unraveling the KNOTTED1 regulatory network in maize meristems Genes Dev 26 1685-90
Bosch M Mayer C-D Cookson A Donnison IS (2011) Identification of genes involved in cell wall biogenesis in grasses by differential gene expression profiling of elongating and non-elongating maize internodes J Exp Bot 62 3545-3561
Camacho C Coulouris G Avagyan V Ma N Papadopoulos J Bealer K Madden TL (2009) BLAST+ architecture and applications BMC Bioinformatics 10 421
Conesa A Madrigal P Tarazona S Gomez-Cabrero D Cervera A McPherson A Szczesniak MW Gaffney DJ Elo LL Zhang X et al (2016) A survey of best practices for RNA-seq data analysis Genome Biol 17 13
D'haeseleer P Liang S Somogyi R (2000) Genetic network inference from co-expression clustering to reverse engineering Bioinformatics 16 707-726
Davidson RM Hansey CN Gowda M Childs KL Lin H Vaillancourt B Sekhon RS de Leon N Kaeppler SM Jiang N et al (2011) Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes Plant Genome J 4 191
Desprez T Juraniec M Crowell EF Jouy H Pochylova Z Parcy F Höfte H Gonneau M Vernhettes S (2007) Organization of cellulose synthase complexes involved in primary cell wall synthesis in Arabidopsis thaliana Proc Natl Acad Sci 104 15572-15577
Dhillon IS Modha DS (2001) Concept decompositions for large sparse text data using clustering Mach Learn 42 143-175
Dillies M-A Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot G Castel D Estelle J (2013a) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Dillies MA Rau A Aubert J Hennequet-Antier C Jeanmougin M Servant N Keime C Marot NS Castel D Estelle J et al (2013b) A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis Brief Bioinform 14 671-683
Doncheva NT Assenov Y Domingues FS Albrecht M (2012) Topological analysis and interactive visualization of biological networks and protein structures Nature Protoc 7 670-85
Dong J Horvath S (2007) Understanding network concepts in modules BMC Syst Biol 1 24
Dotto MC Petsch KA Aukerman MJ Beatty M Hammell M Timmermans MCP (2014) Genome-wide analysis of leafbladeless1-regulated and phased small RNAs underscores the importance of the TAS3 ta-siRNA pathway to maize development PLoS Genet 10 e1004826
Du D Rawat N Deng Z Gmitter FG (2015) Construction of citrus gene coexpression networks from microarray data using random matrix theory Hortic Res 2 15026
Du Z Zhou X Ling Y Zhang Z Su Z (2010) agriGO a GO analysis toolkit for the agricultural community Nucleic Acids Res 38 64-70
Enright AJ Van Dongen S Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families Nucleic Acids Res 30 1575-1584
Fadista J Vikman P Laakso EO Mollet IG Esguerra J Lou Taneera J Storm P Osmark P Ladenvall C Prasad RB (2014) Global genomic and transcriptomic analysis of human pancreatic islets reveals novel genes influencing glucose metabolism Proc Natl Acad Sci 111 13924-13929
Faith JJ Hayete B Thaden JT Mogno I Wierzbowski J Cottarel G Kasif S Collins JJ Gardner TS (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles PLoS Biol 5 0054-0066
Fedoroff N V (2012) McClintock's challenge in the 21st century Proc Natl Acad Sci 109(50) 20200-20203
Ficklin SP Feltus FA (2011) Gene coexpression network alignment and conservation of gene modules between two grass species maize and rice Plant Physiol 156 1244-56
Filzmoser P Fritz H Kalcher K (2009) pcaPP Robust PCA by Projection Pursuit R Packag version 1
Fu J Cheng Y Linghu J Yang X Kang L Zhang Z Zhang J He C Du X Peng Z (2013) RNA sequencing reveals the complex regulatory network in the maize kernel Nature Commun 4 2832
Gao S Ver Steeg G Galstyan A (2015) Efficient Estimation of Mutual Information for Strongly Dependent Variables Artificial Intelligence and Statistics 277-286
Gillis J Pavlidis P (2011) The role of indirect connections in gene networks in predicting function Bioinformatics 27 1860-1866
Giorgi F Fabbro C Del Licausi F (2013) Comparative study of RNA-seq-and Microarray-derived coexpression networks in Arabidopsis thaliana Bioinformatics 2 1-8
Gu Y Kaplinsky N Bringmann M Cobb A Carroll A Sampathkumar A Baskin TI Persson S Somerville CR (2010) Identification of a cellulose synthase-associated protein required for cellulose biosynthesis Proc Natl Acad Sci 107 12866-12871
Han Y Gao S Muegge K Zhang W Zhou B (2015) Advanced applications of RNA sequencing and challenges Bioinform Biol Insights 9 29-46
Horvath S Dong J (2008) Geometric interpretation of gene coexpression network analysis PLoS Comput Biol 4 e1000117
Huang J Lynn JS Schulte L Vendramin S McGinnis K (2017) Chapter Two-Epigenetic Control of Gene Expression in Maize Int Rev Cell Mol Biol 328 25-48
Iancu OD Kawane S Bottomly D Searles R Hitzemann R McWeeney S (2012) Utilizing RNA-Seq data for de novo coexpression network inference Bioinformatics 28 1592-1597
Kim D Langmead B Salzberg SL (2015) HISAT a fast spliced aligner with low memory requirements Nat Methods 12 357-360
Köster J Rahmann S (2012) Snakemake a scalable bioinformatics workflow engine Bioinformatics 28 2520-2522
Krouk G Lingeman J Colon A Coruzzi G Shasha D (2013) Gene regulatory networks in plants learning causality from time and perturbation Genome Biol 14 123
Kumar S Stecher G Suleski M Hedges SB (2017) TimeTree a resource for timelines timetrees and divergence times Mol Biol Evol 34 1812-1819
Kumari S Nie J Chen H-S Ma H Stewart R Li X Lu M-Z Taylor WM Wei H (2012) Evaluation of gene association methods for coexpression network construction and biological knowledge discovery PLoS One 7 e50411
Langfelder P Horvath S (2008) WGCNA an R package for weighted correlation network analysis BMC Bioinformatics 9 559
Leinonen R Sugawara H Shumway M (2010) The sequence read archive Nucleic Acids Res gkq1019
Li C Qiao Z Qi W Wang Q Yuan Y Yang X Tang Y Mei B Lv Y Zhao H et al (2015a) Genome-Wide Characterization of cis-Acting DNA Targets Reveals the Transcriptional Regulatory Framework of Opaque2 in Maize Plant Cell 27 532-545
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the genetic architecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance Reverse Engineering Analysis 2013
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data BMC Bioinformatics 16 347
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomicfeatures Bioinformatics 30 923-930
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverseengineering gene networks Bioinformatics pp 282-288
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 20134291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Li H Peng Z Yang X Wang W Fu J Wang J Han Y Chai Y Guo T Yang N (2013a) Genome-wide association study dissects the geneticarchitecture of oil biosynthesis in maize kernels Nat Genet 45 43-50
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li J Wei H Zhao PX (2013b) DeGNServer Deciphering Genome-Scale Gene Networks through High Performance ReverseEngineering Analysis 2013
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li P Piao Y Shon HS Ryu KH (2015b) Comparing the normalization methods for the differential analysis of Illumina high-throughputRNA-Seq data BMC Bioinformatics 16 347 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a)Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematicvariation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of the Genome to Challenge Science 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 10.1155/2007/79879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene - a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 10.1371/journal.pone.0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 2013 4 291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics5 87
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within genecoexpression networks BMC Bioinformatics 6 227
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering PlantsStudy from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize HistoneDeacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstructionPLoS One 9 e106319 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56195-214
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for MaizePlant Physiol 170 pp1501821-
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Zyprych-Walczak J Szabelska A Handschuh L Goacuterczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalizationmethods on RNA-Seq data analysis Biomed Res Int 2015
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression NetworksConstruction and Visualization Tool Front Plant Sci 6 1194
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51 wwwplantphysiolorgon August 22 2020 - Published by Downloaded from
Copyright copy 2017 American Society of Plant Biologists All rights reserved
expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev MolCell Biol 16 258-264
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omicnetworks in a developmental atlas of maize Science 353 814-818
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics16 S6
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparsecanonical correlation analysis Ann Appl Stat 9 300-323
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015
Li Q Eichten SR Hermanson PJ Zaunbrecher VM Song J Wendt J Rosenbaum H Madzima TF Sloan AE Huang J et al (2014a) Genetic Perturbation of the Maize Methylome Plant Cell 26 4602-4616
Li S Labaj PP Zumbo P Sykacek P Shi W Shi L Phan J Wu P-Y Wang M Wang C (2014b) Detecting and correcting systematic variation in large-scale RNA sequencing data Nature Biotechnol 32 888-895
Li Y Pearl SA Jackson SA (2015c) Gene Networks in Plant Biology Approaches in Reconstruction and Analysis Trends Plant Sci 20 664-675
Liao Y Smyth GK Shi W (2014) featureCounts an efficient general purpose program for assigning sequence reads to genomic features Bioinformatics 30 923-930
Lim WK Wang K Lefebvre C Califano A (2007) Comparative analysis of microarray normalization procedures Effects on reverse engineering gene networks Bioinformatics pp 282-288
Liu S Liu Y Zhao J Cai S Qian H Zuo K Zhao L Zhang L (2017) A computational interactome for prioritizing genes associated with complex agronomic traits in rice Plant J 90 177-188
Lonsdale J Thomas J Salvatore M Phillips R Lo E Shad S Hasz R Walters G Garcia F Young N (2013) The genotype-tissue expression (GTEx) project Nat Genet 45 580-585
Love MI Huber W Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biol 15 1
Luo X You Z Zhou M Li S Leung H Xia Y Zhu Q (2015) A highly efficient approach to protein interactome mapping based on collaborative filtering framework Sci Rep 5 7702
Ma C Wang X (2012) Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis Plant Physiol 160 192-203
Ma H-W Zeng A-P (2003) The connectivity structure giant strong component and centrality of metabolic networks Bioinformatics 19 1423-1430
Ma S Shah S Bohnert HJ Snyder M Dinesh-Kumar SP (2013) Incorporating motif analysis into gene co-expression networks reveals novel modular expression pattern and new signaling pathways PLoS Genet 9 e1003840
Marbach D Costello JC Küffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012) Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maize Plant J 43 929-940
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expression analysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks Eurasip J Bioinforma Syst Biol doi 101155200779879
Meyer PE Lafitte F Bontempi G (2008) minet A R/Bioconductor package for inferring large transcriptional networks using mutual information BMC Bioinformatics 9 461
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al (2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejía-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B Emiliani J et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithm clustering plugin for Cytoscape BMC Bioinformatics 12 436
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq Nat Methods 5 621-628
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res 37 D987-D991
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biological information in plants Bioinformatics 26 1267-1268
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics 4 33
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4 Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009) Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440
Rau A Gallopin M Celeux G Jaffrézic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments Bioinformatics 29 2146-2152
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital gene expression data Bioinformatics 26 139-140
Sales G Romualdi C (2011) parmigene a parallel R package for mutual information estimation and gene network reconstruction Bioinformatics 27 1876-1877
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicate accuracy in a sea of changing annotations Database J Biol databases curation 2016
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro a platform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using COB the co-expression browser PLoS One doi 101371journalpone0099193
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R package
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115
Schwanhäusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian gene expression control Nature 473 337-342
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mouse crosses Frontiers in Genetics 20134291
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges Front Plant Sci 7 444
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a software environment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes without meiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model based indices BMC Bioinformatics 13 328
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expression atlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015) STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Marbach D Costello JC Kuumlffner R Vega NNM Prill RJ Camacho DM Allison KR Aderhold A Allison KR Bonneau R et al (2012)Wisdom of crowds for robust gene network inference Nat Methods 9 796-804
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Margolin AA Nemenman I Basso K Wiggins C Stolovitzky G Dalla Favera R Califano A (2006) ARACNE an algorithm for thereconstruction of gene regulatory networks in a mammalian cellular context BMC Bioinformatics 7 S7
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mark Cigan A Unger-Wallace E Haug-Collet K (2005) Transcriptional gene silencing as a tool for uncovering gene function in maizePlant J 43 929-940
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Martin M (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads EMBnet J 17 pp-10Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Maza E Frasse P Senin P Bouzayen M Zouine M (2013) Comparison of normalization methods for differential gene expressionanalysis in RNA-Seq experiments A matter of relative size of studied transcriptomes Commun Integr Biol 6 e25849
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
McClintock B (1983) The Significance of Responses of THE GENOME TO CHALLENGE Science (80- ) 792-801Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Kontos K Lafitte F Bontempi G (2007) Information-theoretic inference of large transcriptional regulatory networks EurasipJ Bioinforma Syst Biol doi 101155200779879
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Meyer PE Lafitte F Bontempi G (2008) minet ARBioconductor package for inferring large transcriptional networks using mutualinformation BMC Bioinformatics 9 461
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Monaco MK Sen TZ Dharmawardhana PD Ren L Schaeffer M Naithani S Amarasinghe V Thomason J Harper L Gardiner J et al(2013) Maize Metabolic Network Construction and Transcriptome Analysis Plant Genome 6 12
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morohashi K Casas MI Falcone Ferreyra ML Falcone Ferreyra L Mejiacutea-Guerra MK Pourcel L Yilmaz A Feller A Carvalho B EmilianiJ et al (2012) A genome-wide regulatory framework identifies maize pericarp color1 controlled genes Plant Cell 24 2745-64
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Morris JH Apeltsin L Newman AM Baumbach J Wittkop T Su G Bader GD Ferrin TE (2011) clusterMaker a multi-algorithmclustering plugin for Cytoscape BMC Bioinformatics 12 436
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mortazavi A Williams BA McCue K Schaeffer L Wold B (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq NatMethods 5 621-628
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Mukaka MM (2012) A guide to appropriate use of correlation coefficient in medical research Malawi Med J 24 69-71Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Obayashi T Hayashi S Saeki M Ohta H Kinoshita K (2009) ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
Acids Res 37 D987-D991Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ogata Y Suzuki H Sakurai N Shibata D (2010) CoP a database for characterizing co-expressed gene modules with biologicalinformation in plants Bioinformatics 26 1267-1268
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Oshlack A Wakefield MJ (2009) Transcript length bias in RNA-seq data confounds systems biology Biol Direct 4 14Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Park T Yi S-G Kang S-H Lee S Lee Y-S Simon R (2003) Evaluation of normalization methods for microarray data BMC Bioinformatics4 33
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Paulson J Chen C-Y Lopes-Ramos CM Kuijjer ML Platig J Sonawane AR Fagny M Glass K Quackenbush J (2016) Tissue-awareRNA-Seq processing and normalization for heterogeneous and sparse data bioRxiv 81802
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Pautler M Eveland AL LaRue T Yang F Weeks R Lunde C Je B Il Meeley R Komatsu M Vollbrecht E et al (2015) FASCIATED EAR4Encodes a bZIP Transcription Factor That Regulates Shoot Meristem Size in Maize Plant Cell Online 2 tpc114132506
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Penning BW Hunter 3rd CT Tayengwa R Eveland AL Dugard CK Olek AT Vermerris W Koch KE McCarty DR Davis MF et al (2009)Genetic resources for maize cell wall biology Plant Physiol 151 1703-1728
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Ponnala L Wang Y Sun Q Wijk KJ (2014) Correlation of mRNA and protein abundance in the developing maize leaf Plant J 78 424-440Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Rau A Gallopin M Celeux G Jaffreacutezic F (2013) Data-based filtering for replicated high-throughput transcriptome sequencingexperiments Bioinformatics 29 2146-2152
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Robinson MD McCarthy DJ Smyth GK (2010) edgeR a Bioconductor package for differential expression analysis of digital geneexpression data Bioinformatics 26 139-140
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sales G Romualdi C (2011) parmigenemdasha parallel R package for mutual information estimation and gene network reconstructionBioinformatics 27 1876-1877
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sangrador-Vegas A Mitchell AL Chang H-Y Yong S-Y Finn RD (2016) GO annotation in InterPro why stability does not indicateaccuracy in a sea of changing annotations Database J Biol databases curation 2016
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sato Y Antonio BA Namiki N Takehisa H Minami H Kamatsuki K Sugimoto K Shimizu Y Hirochika H Nagamura Y (2011) RiceXPro aplatform for monitoring gene expression in japonica rice grown under natural field conditions Nucleic Acids Res 39 D1141-D1148
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schaefer RJ Briskine R Springer NM Myers CL (2014) Discovering functional modules across diverse maize transcriptomes using wwwplantphysiolorgon August 22 2020 - Published by Downloaded from Copyright copy 2017 American Society of Plant Biologists All rights reserved
COB the co-expression browser PLoS One doi 101371journalpone0099193Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schmidt D (2016) Co-Operation Fast Correlation Covariance and Cosine Similarity R packagePubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schnable P Ware D Fulton R Stein J (2009) The B73 maize genome complexity diversity and dynamics Science (80- ) 326 1112-1115Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Schwanhaumlusser B Busse D Li N Dittmar G Schuchhardt J Wolf J Chen W Selbach M (2011) Global quantification of mammalian geneexpression control Nature 473 337-342
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Scott-Boyer M-P Haibe-Kains B Deschepper CF (2013) Network statistics of genetically-driven gene co-expression modules in mousecrosses Frontiers in Genetics 20134291
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Serin EAR Nijveen H Hilhorst HWM Ligterink W (2016) Learning from Co-expression Networks Possibilities and Challenges FrontPlant Sci 7 444
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Shannon P Markiel A Ozier O Baliga NS Wang JT Ramage D Amin N Schwikowski B Ideker T (2003) Cytoscape a softwareenvironment for integrated models of biomolecular interaction networks Genome Res 13 2498-2504
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Sing T Sander O Beerenwinkel N Lengauer T (2005) ROCR visualizing classifier performance in R Bioinformatics 21 3940-3941Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Singh M Goel S Meeley RB Dantec C Parrinello H Michaud C Leblanc O Grimanelli D (2011) Production of viable gametes withoutmeiosis in maize deficient for an ARGONAUTE protein Plant Cell 23 443-458
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Song L Langfelder P Horvath S (2012) Comparison of co-expression measures mutual information correlation and model basedindices BMC Bioinformatics 13 328
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Stelpflug SC Rajandeep S Vaillancourt B Hirsch CN Buell CR Leon N De Kaeppler SM (2015) An expanded maize gene expressionatlas based on RNA-sequencing and its use to explore root development Plant Genome 314-362
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Szklarczyk D Franceschini A Wyder S Forslund K Heller D Huerta-Cepas J Simonovic M Roth A Santos A Tsafou KP et al (2015)STRING v10 Protein-protein interaction networks integrated over the tree of life Nucleic Acids Res 43 D447-D452
Pubmed Author and TitleCrossRef Author and TitleGoogle Scholar Author Only Title Only Author and Title
Tzfadia O Diels T De Meyer S Vandepoele K Aharoni A Van de Peer Y (2016) CoExpNetViz Comparative Co-Expression Networks Construction and Visualization Tool Front Plant Sci 6 1194
Usadel B Obayashi T Mutwil M Giorgi FM Bassel GW Tanimoto M Chow A Steinhauser D Persson S Provart NJ (2009) Co-expression tools for plant biology opportunities for hypothesis generation and caveats Plant Cell Environ 32 1633-51
USDA (2016) Grain World Markets and Trade
Vanholme R Demedts B Morreel K Ralph J Boerjan W (2010) Lignin biosynthesis and structure Plant Physiol 153 895-905
Verdin E Ott M (2014) 50 Years of Protein Acetylation From Gene Regulation To Epigenetics Metabolism and Beyond Nat Rev Mol Cell Biol 16 258-264
Walley JW Sartor RC Shen Z Schmitz RJ Wu KJ Urich MA Nery JR Smith LG Schnable JC Ecker JR (2016) Integration of omic networks in a developmental atlas of maize Science 353 814-818
Wang W Zhou XJ Liu Z Sun F (2015a) Network tuned multiple rank aggregation and applications to gene ranking BMC Bioinformatics 16 S6
Wang YXR Jiang K Feldman LJ Bickel PJ Huang H (2015b) Inferring gene-gene interactions and functional modules using sparse canonical correlation analysis Ann Appl Stat 9 300-323
Wang Z Gerstein M Snyder M (2009) RNA-Seq a revolutionary tool for transcriptomics Nat Rev Genet 10 57-63
Wei C Li J Bumgarner RE (2004) Sample size for detecting differentially expressed genes in microarray experiments BMC Genomics 5 87
Wolfe CJ Kohane IS Butte AJ (2005) Systematic survey reveals general applicability of guilt-by-association within gene coexpression networks BMC Bioinformatics 6 227
Wu D-D Wang X Li Y Zeng L Irwin DM Zhang Y-P (2014) Out of Pollen Hypothesis for Origin of New Genes in Flowering Plants Study from Arabidopsis thaliana Genome Biol Evol 6 2822-2829
Yang H Liu X Xin M Du J Hu Z Peng H Rossi V Sun Q Ni Z Yao Y (2016) Genome-wide Mapping of Targets of Maize Histone Deacetylase HDA101 Reveals Its Function and Regulatory Mechanism during Seed Development Plant Cell 28 629-645
Yim WC Yu Y Song K Jang CS Lee B-M (2013) PLANEX the plant co-expression database BMC Plant Biol 13 83
Zheng W Chung LM Zhao H (2011) Bias detection and correction in RNA-Sequencing data BMC Bioinformatics 12 290
Zhong R Allen JD Xiao G Xie Y (2014) Ensemble-based network aggregation improves the accuracy of gene network reconstruction PLoS One 9 e106319
Zhong R Ye ZH (2015) Secondary cell walls Biosynthesis patterned deposition and transcriptional regulation Plant Cell Physiol 56 195-214
Zhu G Wu A Xu X-J Xiao P Lu L Liu J Cao Y Chen L Wu J Zhao X-M (2016) PPIM A protein-protein interaction database for Maize Plant Physiol 170 pp1501821-
Zyprych-Walczak J Szabelska A Handschuh L Górczak K Klamecka K Figlerowicz M Siatkowski I (2015) The impact of normalization methods on RNA-Seq data analysis Biomed Res Int 2015