1 / 26
Nomogram Integrating Genomics with Clinicopathological Features Improves
Prognosis Prediction for Colorectal Cancer
Yongfu Xiong1†, Wenxian You
1†, Min Hou2†, Linglong Peng
1, He Zhou
1, Zhongxue
Fu1*
1Department of Gastrointestinal Surgery, The First Affiliated Hospital of Chongqing
Medical University, Chongqing, China.
2Department of Oncology, Affiliated Hospital of North Sichuan Medical College,
Nanchong, China.
†Equal contributor
*Correspondence: Zhongxue Fu, Ph.D. Department of Gastrointestinal Surgery, The
First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, China.
E-mail: [email protected]
Running title
A Nomogram to Predict Colorectal Cancer Prognosis
Keywords: CRC, genomic signature, nomogram, clinicopathological features
Disclosure of potential conflicts of interest
The authors have no potential conflicts of interest to disclose.
Financial support
This work was financially supported by a grant from the National Natural Science
foundation of China (grant no. 81572319; project recipient: ZhongXue Fu).
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
2 / 26
Abstract
The current tumor staging system is insufficient for predicting the outcomes of
patients with colorectal cancer (CRC) because of its phenotypic and genomic
heterogeneity. Integrating gene expression signatures with clinicopathological factors
may yield a predictive accuracy exceeding that of the currently available system.
Twenty-seven signatures that used gene expression data to predict CRC prognosis
were identified and re-analyzed using bioinformatic methods. Next, clinically
annotated CRC samples (n=1710) with the corresponding expression profiles, that
predicted a patient's probability of cancer recurrence, were pooled to evaluate their
prognostic values and establish a clinicopathologic-genomic nomogram. Only 2 of the
27 signatures evaluated showed a significant association with prognosis and provided
a reasonable prediction accuracy in the pooled cohort (HR = 2.46, 95% CI =
1.183-5.132, p < 0.001, AUC = 60.83; HR =2.33, 95% CI = 1.218-4.453, p < 0.001,
AUC = 71.34). By integrating the above signatures with prognostic
clinicopathological features, a clinicopathologic-genomic nomogram was cautiously
constructed. The nomogram successfully stratified CRC patients into three risk groups
with remarkably different DFS rates and further stratified stage II and III patients into
distinct risk subgroups. Importantly, among patients receiving chemotherapy, the
nomogram determined that those in the intermediate- (HR = 0.98, 95% CI =
0.255-0.679, p < 0.001) and high-risk (HR = 0.67, 95% CI = 0.469-0.957, p = 0.028)
groups had favorable responses.
Implications: These findings offer evidence that genomic data provides independent
and complementary prognostic information, and incorporation of this information
refines the prognosis of CRC.
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
3 / 26
Introduction
Colorectal carcinoma (CRC) is the third most commonly diagnosed malignant disease
and the second leading cause of cancer death worldwide (1). Despite advances in
CRC screening, diagnosis, and curative resection, its prognosis is not entirely
satisfactory because optimal management and individual therapy strategies still
present great challenges, as CRC is a well-recognized heterogeneous disease.
Currently, treatment decisions and prognoses for patients with CRC are primarily
driven by assessment of the tumor-node-metastasis (TNM) staging system, which is
based merely on anatomical information (2). Previous studies revealed that patients
with stage I CRC have a 5-year survival rate of approximately 93%, which decreases
to approximately 80% for patients with stage II disease and to 60% for patients with
stage III disease (3). However, discrepancies in the survival outcomes of patients at
the same stage and receiving similar treatments is commonly observed.
More importantly, according to the current TNM stage, adjuvant therapy is
generally recommended for all patients with stage III disease (4). However, patients
with T1-2N1M0 tumors (stage IIIA) have significantly higher survival rates than
those with stage IIB tumors (3), suggesting that stratifying patients at high risk of
recurrence, who are most likely to benefit from adjuvant therapy, is critical. Therefore,
substantial effort has been made to discover new clinicopathological indicators to
optimize the current staging system. To date, some clinicopathological features, such
as emergency presentation, adjacent organ involvement (T4), intestinal perforation or
obstruction, high tumor grade, inadequate sampling of lymph nodes and
lymphatic/vascular invasion, with prognostic value to classify CRC patients who are
at a high risk of recurrence have been applied in clinical practice. However, these
factors are all relatively weak and insufficient to identify CRC patients who may
benefit from adjuvant therapy, which leads to potential under- or over-treatment (5).
CRC biological behavior (e.g., recurrence, metastasis, drug resistance) is a
tightly regulated process that requires the aberrant expression of related genes to
empower carcinoma cells with their corresponding abilities. Accordingly, before
clinicopathologically detectable changes occur, the underlying alterations necessary
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
4 / 26
for recurrence have already occurred in the primary CRC, providing the possibility to
develop robust prognostic tools by using multiple genes in combination (6). During
the last decade, gene signatures have shown great promise in predicting the long-term
outcomes and treatment responses of individual patients (7). Of note, because of the
outstanding ability to predict the prognoses of patients with breast cancer, multigene
assays, such as Oncotype DX and MammaPrint, have been successfully approved by
the US Food and Drug Administration and are available in routine clinical practice (8).
Actually, genomic signatures to predict the prognosis of CRC have also been
continually developed in the past 10 years, but none are commercially used in the
clinic (9). For example, Agesen et al. established a 13-gene expression classifier,
ColoGuideEx, for prognosis predictions specific to patients with stage II CRC (10).
Based on the essential role of lipid metabolism in carcinogenesis, Teodoro et al.
identified a group of 4 genes that predict survival in intermediate-stage colon cancer
patients, allowing the delineation of a high-risk group that may benefit from adjuvant
therapy. In addition, Smith (11), Chen (12), Oh (13) and Popovici et al. (14) also
published their own signatures based on different mechanisms. Despite a large
number of studies, no signatures have remained credible in either meta-analyses or
prospective trials (15). However, these published signatures clearly show low
prediction accuracies but moderate clinical usefulness (15). Moreover, when confined
to a specific CRC stage, promising results regarding risk stratification have also been
reported. Therefore, investigating the predictive ability of these published signatures
on comprehensive large-scale datasets and identifying whether any can be used to
clinically guide treatment decisions is necessary.
Currently, while doubts about the predictive value of clinicopathological features
are increasing, they still provide the most reliable guidelines for the prognostication
and treatment of CRC (16). Thus, we hypothesized that integrating genomic
signatures with clinicopathological features in a model would yield a predictive
accuracy exceeding that of the currently available prognostic system. Nomogram is a
statistical prediction model that combines multiple prognostic factors to make
intuitive graphical and individualized predictions (17). Here, we aimed to apply a
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
5 / 26
systematic approach to evaluate the clinical usefulness of CRC-related signatures and
then construct a composite clinicopathologic-genomic nomogram by integrating
factors with potential prognostic value in a training set. Moreover, using another
independent set, the capacity of the nomogram to stratify CRC patients most likely to
benefit from chemotherapy was further validated.
Materials and Methods
Patients and prognostic signatures
To identify gene expression data arrayed using the Affymetrix platform with clinically
annotated data, we systematically searched Gene Expression Omnibus (GEO,
http://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (http://www.ebi.ac.uk/arrayexpress/)
and related literature with the keywords “colorectal cancer ”, “CRC”, “colon cancer ”,
“survival”, “relapse”, “recurrence”, “prognostic”, “prognosis” and “outcomes”
published through Aug 1, 2017. For some datasets whose clinical data did not
accompany their gene expression profiles, we either searched the supplements or
contacted one or more of the investigators to obtain the missing information.
Moreover, datasets with small sample sizes and duplications were excluded. Raw
microarray data and the corresponding clinical data of these datasets were retrieved
and manually organized when available. Only patients diagnosed with colorectal
cancer having clinicopathological and survival information available were included.
Patients with follow-up or survival times of less than 1 month as well as patients with
missing or insufficient data on age, local invasion, lymph node metastasis, distant
metastasis and TNM stage were excluded from subsequent processing. Eventually, all
patients satisfying the inclusion criteria were combined and summarized in
Supplementary Table S1 and S2.
Expression data processing
For raw CEL files available from Affymetrix microarrays, the data were normalized
and annotated using a MAS5 algorithm and the corresponding annotation files from R
Bioconductor to obtain summarized values for each probeset; otherwise,
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
6 / 26
pre-processed data as provided by the contributors were used. For each sample in
every data set, measurements without a gene annotation were excluded, and multiple
probesets corresponding to a single gene were summarized into a gene symbol by
taking the most variable probeset measured by the interquartile range (IQR).
Identification and analysis of gene signatures potentially related to CRC
prognosis
Gene signatures potentially related to CRC prognosis were systematically retrieved
from PubMed. The search was restricted to recent papers to increase validity (from
January 2004 to August 2017). The selection criteria are detailed in Supplementary
Figure S1. Articles that provided a list of differentially expressed genes in primary
tumor samples associated with CRC prognosis were included in our study. Studies
based on tissue microarray and those that were exclusively focused on differences
between stages or between primary tumors and metastases were excluded. The
signatures finally included in our analysis are described in Table 1 (detailed
descriptions provided in Supplementary Table S3). In addition, the probesets or genes
of those signatures were re-annotated using the SOURCE web tool to address the
retired gene symbols and their differences in the tested platforms. The re-annotated
genes were then subjected to biological function enrichment analysis, and the online
analytical tool DAVID (Database for Annotation, Visualization and Integrated
Discovery) (18) was used to enrich gene ontology (GO) functions and Kyoto
Encyclopedia of Genes and Genomes (KEGG) pathways. GO terms and KEGG
pathways with significant enrichment false discovery rate (FDR) values less than 0.05
were selected for further analysis. In addition, genes from the abovementioned
signatures were mapped and imported into the Retrieval of Interacting Genes/Proteins
(STRING) 9.1 database, which queried the human protein-protein interaction network
for interactions between effective linkers and seeds to construct a functional
subnetwork. Cluster analyses were performed using correlation distance metrics and
the average linkage agglomeration algorithm (R package nclust version 1.9.0; ref. 34).
Non-negative matrix factorization (NMF) was done with the NMF package (version
0.20.5) and standard strategies.
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
7 / 26
Subclass prediction of CRC patients in the training set
The re-preprocessed microarray dataset, which represents the genomic features of the
individual, were classified with the prognostic signatures identified above using the
nearest template prediction (NTP) method (19) as implemented in Gene Pattern
software (Broad Institute of Harvard and MIT, Boston, MA) (20). NTP requires only a
list of pre-specified template signature genes and a dataset to be tested and not a
corresponding training dataset to capture good and poor gene expression patterns in
each sample. Briefly, a template containing representative expression patterns of the
signature genes was defined based on published gene signatures from their respective
studies. The proximity of the signature gene expression patterns of the sample to the
template was evaluated by calculating the cosine distance. The FDR was used to
correct the p-values for multiple hypothesis testing, and prediction analysis was
performed separately for each dataset. A prediction of good or poor for the related
signature was determined based on FDR predictions <0.05, and the remaining
samples with intermediate expression levels of the correlated genes in the signature
were classified as uncertain.
Concordance among these predictions was evaluated using unsupervised
clustering according to the Cramer’s V coefficient of the paired prediction overlap as
previously described (21). Cramer’s V statistic values range from 0 to 1, with values
between 0.36 and 0.49 indicating substantial correlation, and values higher than 0.5
indicating strong correlation.
Development, comparison, and validation of the prognostic models
All the patients included in this study were randomly separated into training and
validation sets. Disease-free survival (DFS) was defined as the interval between the
day that surgery was performed and the day that recurrence was first detected. If
recurrence was not diagnosed, the date of death due to CRC or last follow-up was
used. Overall survival (OS) was calculated from the date of diagnosis or surgery to
the date of death or last follow-up. Continuous variables were expressed as median
(IQR) or median (range) values, and group comparisons were performed by the
Student's t test or the Wilcoxon rank sum test. Categorical variables were expressed as
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
8 / 26
percentages, and group comparisons were performed by Pearson’s χ2 test or the
Fisher’s exact test. Median follow-up was calculated using the reverse Kaplan-Meier
method (22). In the training set, survival curves for different variable values were
generated using Kaplan-Meier estimates and compared using the log-rank test.
Variables that achieved significance at p values less than 0.05 were subjected to
multivariate analyses via the Cox regression model. Then, statistical analyses to
identify independent prognostic factors were conducted. Based on the multivariate
analysis results, a prognostic nomogram was established using a backward stepwise
Cox proportional hazard model. The entire population was divided into three risk
groups (high, intermediate, low) according to the tertiles of the total scores given by
the established nomogram in the training set. Concordance index (c-index) values
were used to measure the discrimination performance. Finally, for external validation,
the total scores of each patient in the validation set were calculated according to the
proposed nomogram to verify its generalization. Three risk groups were determined
by the tertiles defined in the training set, and the respective Kaplan-Meier survival
curves were delineated. All statistical tests performed were two-sided, and p values
less than 0.05 were considered statistically significant.
Results
Screening and analysis of eligible samples and published signatures
To comprehensively evaluate their clinical usefulness, we retrieved all of the
published signatures. Accordingly, 81 signatures published as valid prognostic tools in
CRC were obtained from 507 potentially related articles. Among these signatures,
those with genes not clearly described in their respective studies or investigating the
prognostic value base on tissue array, immune cells, circulation blood or RT-PCR
results were further excluded (see flowchart in Supplementary Figure S1 for details).
Finally, a total of 27 signatures from 26 studies met all of the inclusion criteria and
were retained for subsequent analysis (Supplementary Table S3). Among these
signatures, almost all were obtained from experiments based on CRC tissues except
for one, which was derived from an animal experiment (23). The 27 signatures
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
9 / 26
contained 1274 total genes, among which 1041 were unique, and the signature sizes
ranged from 4 to 264 (Figure 1A). The top overlapping genes in the above signatures
were FN1 (ten times), CYP1B1 and POSTN (six times), 7 genes (five times), 7 genes
(four times), 29 genes (three times), and 107 genes (twice). As is clearly shown in
Figure 1A, none of these genes appeared in all signatures. In addition, some repeated
genes in different signatures even presented opposite prognostic values. This
lower-than-expected overlap, at least in part, accounted for the low reproducibility of
the signatures and the failure in clinical practice.
Accumulating evidence strongly suggests that the prognosis of CRC patients,
such as recurrence and metastasis, is tightly regulated by aberrantly expressed genes
via specific biological processes (16). To gain insight into the underlying biological
meanings of the gene set obtained from the above signatures, enrichment (GO and
KEGG) and protein-protein interaction (PPI) analyses were conducted
(Supplementary Table S4). Figure 1B-D shows that specific GO categories closely
related to CRC prognosis, such as cell cycle (FDR<0.0001), regulation of cell
proliferation (FDR<0.0001), extracellular matrix (FDR<0.0001), and chemokine
activity (FDR<0.0001), were significantly enriched. Additionally, some KEGG
pathways well known to be involved in CRC metastasis and recurrence were enriched,
including colorectal cancer metastasis signaling (FDR<0.0001), wnt-β catenin
signaling (FDR = 0.0002), VEGF signaling (FDR = 0.0008), and chemokine signaling
(FDR = 0.0021) (Figure 1E). More importantly, a visualized network of these gene
interactions revealed that the abovementioned pathways not only form highly
integrated modules but also play essential roles in the entire network (Figure 1F).
These findings suggested that notwithstanding the low reproducibility in the system
test and clinical validation, genes obtained from CRC signatures actually have
potential prognostic value.
Furthermore, to assess the global performance of these signature and construct a
composite nomogram, publicly available datasets with whole-genome profiles and
corresponding clinical information in CRC patients were screened and downloaded
(see flowchart in Supplementary Figure S1 for details). Ultimately, 1710 CRC
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
10 / 26
patients pertaining to 9 cohorts (Supplementary Table S5) met the set criteria and
were retained. Although the profiles were obtained using different technologies and
based on fresh or formalin-fixed and paraffin-embedded (FFPE) specimens as source
material, they showed no evidence of cohort-bias clustering by principal component
analysis (PCA, Supplementary Figure S2A-C) or hierarchical clustering
(Supplementary Figure S2D), suggesting that patients from different cohorts in the
present study could be blended together. Moreover, to balance the baseline
characteristics, all eligible patients were randomly divided into a training set (n = 855)
and a validation set (n = 855). The demographic and major clinicopathological
characteristics of the patients are summarized in Supplementary Table S2. As we
expected, no significant differences were observed between the statistical properties
of the training and validation sets.
Global prognostic performance of the published signatures
To determine the correlations of signatures and CRC patient outcomes, the prognostic
performances of the 27 signatures were evaluated in the training set using a modified
NTP method as previously described (19). Among the 27 signatures evaluated, all
except one (Teodoro’s (24) signature had the only poor prognostic gene) were able to
confidently stratify patients (FDR <0.05) into good and poor subgroups. Table 1 and
Figure 2A summarize the prediction results obtained for each of the 855 patients.
Popovici’s signature [34] was the most prevalent prediction in the training set (83.6%;
615 of 855), whereas Agesen’s signature [42] was securely identified in only 19.9%
(167 of 855) of CRC patients. Of note, conflicting prognostic outcomes were
commonly observed among the training set, and only 34% patients had consistent
results (good or poor prognosis) in more than 5 signatures. However, as demonstrated
in Figure 2A, the overall tendency of the NTP analysis was consistent and roughly
consistent with previous studies (25). In addition, global Cramer’s V coefficients were
calculated for each signature to gain insight into their concordances. Unsupervised
clustering based on these coefficients indicated a substantial association among these
signatures (Figure 2B). Obviously, 5 signatures obtained from the studies of Oh (13),
Minh (11), Chang (9), Jiang (26), and Fritzmann et al. (27) were at least moderately
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
11 / 26
correlated with each other. Moreover, another 4 signatures (Toshiaki’s, Aziz’s, Chen’s,
Xu’s) were also clustered with similar coefficients. These findings are largely
consistent with the initial purpose of each signature, which further confirms the
potential prognostic value of these signatures.
We then applied Cox regression analysis to assess whether any signatures were
statistically significant for CRC-related recurrence/progression and independent of
clinical features in the training set. The univariate and clinical factor-adjusted
multivariate analysis results are displayed in Supplementary Table S6 and Figure 2C.
Univariate analyses revealed that age (HR = 1.12, 95% CI = 1.021-1.269, p = 0.048),
local invasion (T stage; HR = 2.21, 95% CI = 1.139-7.029, p = 0.027), lymph node
metastasis (N stage; HR = 2.28, 95% CI = 1.189-3.784, p =0.021), distant metastasis
(M stage; HR = 2.46, 95% CI = 1.363-4.453, p = 0.003), and TNM stage (HR = 2.15,
95% CI = 1.165-3.971, p = 0.014) were confidently associated with poor prognosis.
However, with respect to genomic factors, only 8 signatures were validated to have
significant effects on DFS (Figure 2C). Next, we used multivariate Cox analysis with
the covariates, including age, local invasion, lymph node metastasis, distant
metastasis and TNM stage, to evaluate the predictive values of the 8 signatures. As
shown in Figure 4C, after each covariate was adjusted, Popovici’s signature (HR =
1.92, 95% CI = 1.298-2.841, p = 0.001) and Fritzmann’s signature (HR = 2.16, 95%
CI = 1.144-4.069, p < 0.0001) remained powerful enough to indicate prognosis in the
training set for DFS, which revealed that these two signatures are independent
prognostic factors.
Construction and comparison analysis of the composite nomogram
Accurately predicting DFS in patients with CRC is important for counseling,
follow-up planning, and selection of appropriate adjuvant therapy. Thus, to provide
the clinician with a quantitative and user-friendly method to generate individualized
predictions, we constructed a nomogram that integrated the genomic signatures
identified above with clinicopathological features (Figure 3A). Calibration plots
revealed excellent agreement between the nomogram-predicted probabilities and the
actual observations of 1-, 3- and 5-year DFS (Figure 3B). The nomogram
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
12 / 26
demonstrated that distant metastasis made the largest contribution to prognosis,
followed by the TNM stage and lymph node metastasis. Age and Popovici’s signature
had moderate effects on survival rate. Each category within these variables was
assigned a point on the top scale based on the coefficients from Cox regression. By
summing all of the assigned points for the seven variables and drawing a vertical line
between total points and the survival probability axis, we easily obtained the
estimated probabilities of 1-, 3- and 5-year DFS. The risk score cutoff values (≤10.6,
10.6-21.2 and ≥21.2) were selected in terms of total points to stratify patients into
roughly equal tertiles in the training set, which accurately divided patients into low-
(reference), intermediate- (HR = 2.227, 95% CI = 1.507-3.291, p < 0.001), and
high-risk (HR = 6.787, 95% CI = 4.786-9.624, p < 0.001) subgroups with
significantly different DFS rates (Figure 3C, D).
Although the TNM stage in combination with other clinical factors is a
well-recognized prediction system for CRC prognosis (28), its effectiveness urgently
needs to be increased. To determine whether the genomic signatures added additional
prognostic value to the current system, time-dependent receiver operating
characteristic (ROC) analysis was applied to compare the performances between the
nomogram, clinicopathological factors and genomic signatures (Figure 3E).
Expectedly, the nomogram achieved the greatest area under the ROC curve (AUCs at
5-year DFS = 78.07, 71.40, 71.34, and 60.83, p < 0.05), suggesting that the integrated
clinicopathological genomic nomogram had a prognostic performance superior to
those of clinicopathological and genomic information by themselves.
Validating the nomogram for stratifying CRC patient risks
To validate the correlation between the nomogram score (total points) and DFS
statuses of patients with CRC, Kaplan-Meier analysis and log-rank tests based on the
same cutoff value were conducted on the validation set. As shown in Figure 4A and B,
applying the clinicopathologic-genomic nomogram stratified patients into three
distinct risk subgroups with significantly different DFS rates. In addition, the
prognostic accuracy of the nomogram was remarkably better than those of both
clinicopathological and genomic information by themselves (Figure 4C), which was
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
13 / 26
consistent with the results obtained from the training set. Disagreements on
potentially appropriate candidates for adjuvant therapy prevail, especially for patients
belonging to AJCC stages II and III. In the present study, the survival rates predicted
by the nomogram were significantly distinct between the Kaplan-Meier curves
(Figure 4D, E) in patients categorized by the abovementioned major
clinicopathological features (AJCC stages II and III).
Moreover, to ensure the effectiveness of statistical power, patients with available
chemotherapy information were integrated together regardless of the set to which they
belonged (training set or validation set). Intriguingly, we noted that adjuvant
chemotherapy did not enhance OS in all 869 patients (HR = 0.98, 95% CI =
0.769-1.258, p = 0.873; Figure 5A) or in patients who were defined as low risk by the
nomogram (HR = 0.93, 95% CI = 0.515-1.695, p = 0.824; Figure 8B). However,
patients in the nomogram defined as intermediate (HR = 0.98, 95% CI = 0.255-0.679,
p < 0.001; Figure 5C) or high risk (HR = 0.67, 95% CI = 0.469-0.957, p = 0.028;
Figure 5D) had favorable responses to adjuvant chemotherapy. Taken together, these
results suggested that the clinicopathologic-genomic nomogram has the
discriminatory power to identify CRC patients who are suitable candidates for
adjuvant chemotherapy.
Discussion
Although surgery remains the mainstay of curative treatment, the prognosis of
patients with CRC, especially for long-term outcome, depends more on pre- or
postoperative individualized therapies (29), including various strategies of
chemotherapy and chemoradiation. However, as a well-recognized heterogeneous
cancer, its prognosis varies significantly, and optimal management presents challenges.
Currently, adjuvant chemotherapy is regarded as standard treatment for patients with
stage III colon cancer (4). However, patients with T1-2N1M0 tumors (stage IIIA)
have significantly higher survival rates than those with stage IIB tumors (3).
Moreover, for some patients, long-term survival continues to be jeopardized by the
high risk of recurrence even after chemotherapy, and personalized therapy needs to
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
14 / 26
further optimized. With respect to stage II patients, disagreement regarding which
patients are potentially appropriate candidates for adjuvant therapy prevails. The
MOSAIC trial (30) has shown only a 20% benefit from adjuvant therapy at 3 years for
patients with stage II disease and a 28% benefit for patients with recognized clinical
risk factors. Thus, for approximately 75% of stage II patients who receive curative
surgery, the administration of adjuvant chemotherapy is unnecessary and harmful.
Unfortunately, low-risk stage II patients have not been accurately defined. In addition,
patients with microsatellite instability (MSI CRC) have a better prognosis (31) and do
not benefit from 5-fluorouracil (5-FU)-based adjuvant chemotherapy (32), similar to
patients with CpG island methylation (33). All the abovementioned studies highlight
that the rational application of therapeutic strategies requires accurate prognosis
prediction and risk stratification for CRC patients.
Considering the inherent deficiency of TNM stages, substantial effort has been
placed on their optimization (2), and they have been continuously developed over the
past decade. Substantial changes in CRC classifications, based mainly on the analysis
of surveillance, epidemiology, and end results (SEER) data, were made in the seventh
edition (34). However, a comparative study demonstrated that the seventh TNM
edition did not provide a greater accuracy for predicting CRC patient prognoses but
rather resulted in a more complex classification for daily clinical use (35). The eighth
edition of the CRC staging system, containing major advances, including the
introduction of molecular markers, such as MSI, KRAS and BRAF mutations, to
strengthen its discriminatory power, will be implemented worldwide on January 1,
2018 (2). Nonetheless, the forthcoming edition may not provide substantial
improvement because the prognostic value of clinicopathological factors has peaked.
Moreover, the current models still lack genomic information, which may directly
reflect the heterogeneity of CRC and present substantial potential prognostic value.
Genomic signatures predicting the prognosis of CRC have been continually
developed in the past 10 years. In the present study, we retrieved 81 articles related to
the development of signatures for predicting CRC prognosis, including OS, DFS,
disease-specific survival (DSS) and various types of recurrence. For instance, based
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
15 / 26
on the association between the aberrantly expressed gene and the duration of
individual DFS, Zhang et al. (36) constructed a six RNA-based classifier, which was
further validated to have reliable predictive power for detecting recurrence in patients
with stage II colon cancer. Similarly, Anna et al. (23) developed a genomic profile
specific for adult intestinal stem cells (ISC), which was highly enriched in genes that
positively predicted the risk of recurrence. Patients bearing primary CRCs with high
average expression levels of ISC genes had a relative risk of relapse that was 10-fold
higher than that for patients with low levels (23). Moreover, recent evidence (37)
suggest that gene signatures that closely reflect specific biological processes or
oncogenic pathway statuses also have potential prognostic values for stratifying CRC
patients. Even though almost all of these signatures were proven to have prognostic
value in their respective publications, none are routinely used in clinical practice.
Potential explanations for the unsatisfactory results must be considered. Thus, in
the current study, we exhaustively reviewed and analyzed the published multi-gene
signatures in CRC. To eliminate the potential confounding effects, only data derived
from the Affymetrix human genome platform having clearly described genes were
included. Of the 81 published signatures, 27 signatures from 26 publications met all
of the inclusion criteria and were retained (Table 1 and Supplementary Table S3). The
27 signatures contained 1274 total genes, among which 1041 were unique, and the
signature sizes ranged from 4 to 264 (Figure 1A). The top overlapping genes in the
above signatures were FN1 (ten times), CYP1B1 and POSTN (six times), 7 genes
(five times), 7 genes (four times), 29 genes (three times), and 107 genes (twice). As is
clearly shown in Figure 2A, none of these genes appeared in all signatures. These
results were expected since they were previously reported (15) and in agreement with
the systematic analyses of lung cancer and breast cancer prognostic signatures (38).
Obviously, the slight overlap of these signatures in gene identity was perplexing but
sufficiently explained the low reproducibility. Moreover, genetic heterogeneity was
recently shown to exist not only between but also within tumors, meaning that
biological statuses, including oncogenic and carcinostatic states, might regularly vary
in the same patient (39). Our NTP analysis revealed that some patients concurrently
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
16 / 26
harbored more than one signature, which further confirmed the abovementioned
results and indicated that their tumors were genomic instability at the individual level.
Taken together, the different signatures might reflect diverse biological processes and
be active in different or identical individual tumors, which partially accounts for the
wide nonoverlapping among signatures constructed in different CRC studies.
In terms of global performance, in the training set, 8 of 27 signatures showed a
significant association with prognosis and could reasonably predict the outcome.
However, after multivariate adjustment, only 2 signatures remained powerful enough
to indicate prognosis for DFS, which directly demonstrated the low reproducibility of
CRC genomic signatures. More importantly, Popovici’s signature, Fritzmann’s
signature and TNM stage had remarkable prognostic abilities and were independent of
each other, indicating that these signatures may be used to refine the current
prognostic model and facilitate further stratification of patients with CRC at the same
TNM stage.
Our primary goal was to construct a nomogram that could integrate genomic
signatures with clinicopathological variables in a large cohort of CRC patients to add
additional prognostic value to the current staging system. Our multivariate analysis in
the training set revealed that age, local invasion, lymph node metastasis, distant
metastasis, TNM stage, Popovici’s signature and Fritzmann’s signature were
independent prognostic factors (Supplementary Table S6), which was highly
consistent with previous studies on CRC risk factors (40,41). Accordingly, we
cautiously built a clinicopathologic-genomic nomogram using the abovementioned
significant factors. The AUC of the nomogram showed a superior prediction accuracy
compared to that of the combined clinical factors in the training and validation sets
(Figure 3E, 4C). Additionally, the calibration plots showed optimal agreement
between the expected and observed survival probabilities (Figure 3B), ensuring the
reliability of our nomogram. Moreover, the nomogram successfully stratified CRC
patients into three risk groups with remarkably different DFS values based on total
points, which was further confirmed using the same cutoff value in the validation set.
Intriguingly, the TNM staging system is the most important tool for guiding
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
17 / 26
treatment in clinical practice. However, disagreement prevails on potentially
appropriate candidates for adjuvant therapy, especially for stage II and III CRC
patients. Therefore, the identification of a high-risk subgroup among these patients is
of urgent clinical needed. Our findings suggest that the clinicopathologic-genomic
nomogram could be a promising tool to stratify stage II and III patients into distinct
risk subgroup and guide individualized therapy. Moreover, in the current study, we
clearly revealed that chemotherapy provides a survival benefit to patients classified as
intermediate- and high-risk by the nomogram (Figure 5).
Despite the promising results, there were several shortcomings in this work. First,
the NTP method uses only a list of signature genes to make class predictions in each
patient’s expression data, allowing the method to be less sensitive to differences in
experimental and analytical conditions and applicable to every patient without
optimization of the analysis parameters. In the current study, we applied the NTP
method for all prediction models, which generally consisted of two major components:
a gene signature and a prediction algorithm. Obviously, the intrinsic limitation of the
NTP method is that it inevitably causes the algorithm lose its prognostic value.
Second, the forthcoming 8th edition of the CRC TNM staging system may better
predict prognosis, but data used in the present study was from the current 7th edition
or the 6th edition. Third, the prognostic signatures of CRC included in our study were
not exhaustive, and some promising signatures were excluded because of insufficient
information. In addition, some clinicopathological factors, even those well recognized
to have prognostic value, were not included in the composite nomogram because of
the low availability of information.
In summary, based on analyzing a large-scale CRC cohort and published
multi-gene prognostic signatures, we confirmed herein for the first time that genomic
information in combination with clinical data can improve patient prognostic
evaluations and should be considered as an independent and complementary approach
to the current clinicopathological prognostic model. Furthermore, considering the
wide promise of our nomogram, which integrates genomic signatures with
clinicopathological features to improve the prognosis prediction of colorectal cancer,
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
18 / 26
further systematic research is needed.
Acknowledgments
Conception and design: Z. Fu and Y. Xiong. Analysis and interpretation of data (e.g.,
statistical analysis, biostatistics, computational analysis): Y. Xiong, W. You, and M.
Hou. Writing, review, and/or revision of the manuscript: L. Peng, H. Zhou.
Administrative, technical, and/or material support (i.e., reporting or organizing data):
Y. Xiong, W. You, M. Hou, L. Peng, and H. Zhou. Study supervision: Z. Fu.
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
19 / 26
References
1. Favoriti P, Carbone G, Greco M, Pirozzi F, Pirozzi RE, Corcione F. Worldwide burden of
colorectal cancer: a review. Updates in surgery 2016;68(1):7-11 doi
10.1007/s13304-016-0359-y.
2. Amin MB, Greene FL, Edge SB, Compton CC, Gershenwald JE, Brookland RK, et al. The Eighth
Edition AJCC Cancer Staging Manual: Continuing to build a bridge from a population-based to
a more "personalized" approach to cancer staging. CA Cancer J Clin 2017;67(2):93-9 doi
10.3322/caac.21388.
3. O'Connell JB, Maggard MA, Ko CY. Colon cancer survival rates with the new American Joint
Committee on Cancer sixth edition staging. Journal of the National Cancer Institute
2004;96(19):1420-5 doi 10.1093/jnci/djh275.
4. Mahar AL, Compton C, Halabi S, Hess KR, Weiser MR, Groome PA. Personalizing prognosis in
colorectal cancer: A systematic review of the quality and nature of clinical prognostic tools for
survival outcomes. Journal of surgical oncology 2017 doi 10.1002/jso.24774.
5. Dotan E, Cohen SJ. Challenges in the management of stage II colon cancer. Seminars in
oncology 2011;38(4):511-20 doi 10.1053/j.seminoncol.2011.05.005.
6. Stintzing S, Stremitzer S, Sebio A, Lenz HJ. Predictive and prognostic markers in the treatment
of metastatic colorectal cancer (mCRC): personalized medicine at work.
Hematology/oncology clinics of North America 2015;29(1):43-60 doi
10.1016/j.hoc.2014.09.009.
7. Lu AT, Salpeter SR, Reeve AE, Eschrich S, Johnston PG, Barrier AJ, et al. Gene expression
profiles as predictors of poor outcomes in stage II colorectal cancer: A systematic review and
meta-analysis. Clin Colorectal Cancer 2009;8(4):207-14 doi 10.3816/CCC.2009.n.035.
8. Xin L, Liu Y-H, Martin TA, Jiang WG. The Era of Multigene Panels Comes? The Clinical Utility of
Oncotype DX and MammaPrint. World Journal of Oncology 2017;8(2):34-40 doi
10.14740/wjon1019w.
9. Chang W, Gao X, Han Y, Du Y, Liu Q, Wang L, et al. Gene expression profiling-derived
immunohistochemistry signature with high prognostic value in colorectal carcinoma. Gut
2014;63(9):1457-67 doi 10.1136/gutjnl-2013-305475.
10. Agesen TH, Sveen A, Merok MA, Lind GE, Nesbakken A, Skotheim RI, et al. ColoGuideEx: a
robust gene classifier specific for stage II colorectal cancer prognosis. Gut 2012;61(11):1560-7
doi 10.1136/gutjnl-2011-301179.
11. Smith JJ, Deane NG, Wu F, Merchant NB, Zhang B, Jiang A, et al. Experimentally derived
metastasis gene expression profile predicts recurrence and death in patients with colon
cancer. Gastroenterology 2010;138(3):958-68 doi 10.1053/j.gastro.2009.11.005.
12. Chen H, Sun X, Ge W, Qian Y, Bai R, Zheng S. A seven-gene signature predicts overall survival
of patients with colorectal cancer. Oncotarget 2016 doi 10.18632/oncotarget.10982.
13. Oh SC, Park Y-Y, Park ES, Lim JY, Kim SM, Kim S-B, et al. Prognostic gene expression signature
associated with two molecularly distinct subtypes of colorectal cancer. Gut 2012;61(9):1291-8
doi 10.1136/gutjnl-2011-300812.
14. Popovici V, Budinska E, Tejpar S, Weinrich S, Estrella H, Hodgson G, et al. Identification of a
poor-prognosis BRAF-mutant-like population of patients with colon cancer. Journal of clinical
oncology : official journal of the American Society of Clinical Oncology 2012;30(12):1288-95
doi 10.1200/JCO.2011.39.5814.
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
20 / 26
15. Phillips WA, Sanz-Pamplona R, Berenguer A, Cordero D, Riccadonna S, Solé X, et al. Clinical
Value of Prognosis Gene Expression Signatures in Colorectal Cancer: A Systematic Review.
PloS one 2012;7(11):e48877 doi 10.1371/journal.pone.0048877.
16. Zhang ZY, Gao W, Luo QF, Yin XW, Basnet S, Dai ZL, et al. A nomogram improves AJCC stages
for colorectal cancers by introducing CEA, modified lymph node ratio and negative lymph
node count. Scientific reports 2016;6:39028 doi 10.1038/srep39028.
17. Iasonos A, Schrag D, Raj GV, Panageas KS. How to build and interpret a nomogram for cancer
prognosis. Journal of clinical oncology : official journal of the American Society of Clinical
Oncology 2008;26(8):1364-70 doi 10.1200/JCO.2007.12.9791.
18. Dennis G, Jr., Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, et al. DAVID: Database for
Annotation, Visualization, and Integrated Discovery. Genome biology 2003;4(5):P3.
19. Hoshida Y. Nearest template prediction: a single-sample-based flexible class prediction with
confidence assessment. PloS one 2010;5(11):e15543 doi 10.1371/journal.pone.0015543.
20. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature genetics
2006;38(5):500-1 doi
http://www.nature.com/ng/journal/v38/n5/suppinfo/ng0506-500_S1.html.
21. Fan C, Oh D, Wessels L, Weigelt B, Nuyten D, Nobel A, et al. Concordance among
gene-expression-based predictors for breast cancer. N Engl J Med 2006;355(6):560-9.
22. Guyot P, Ades AE, Ouwens MJ, Welton NJ. Enhanced secondary analysis of survival data:
reconstructing the data from published Kaplan-Meier survival curves. BMC medical research
methodology 2012;12:9 doi 10.1186/1471-2288-12-9.
23. Merlos-Suarez A, Barriga FM, Jung P, Iglesias M, Cespedes MV, Rossell D, et al. The intestinal
stem cell signature identifies colorectal cancer stem cells and predicts disease relapse. Cell
Stem Cell 2011;8(5):511-24 doi 10.1016/j.stem.2011.02.020.
24. ColoLipidGene: signature of lipid metabolism-related genes to predict prognosis in stage-II
colon cancer patients.
25. Wu J, Zhou L, Huang L, Gu J, Li S, Liu B, et al. Nomogram integrating gene expression
signatures with clinicopathological features to predict survival in operable NSCLC: a pooled
analysis of 2164 patients. Journal of experimental & clinical cancer research : CR 2017;36(1):4
doi 10.1186/s13046-016-0477-x.
26. Jiang Y, Casey G, Lavery IC, Zhang Y, Talantov D, Martin-McGreevy M, et al. Development of a
clinically feasible molecular assay to predict recurrence of stage II colon cancer. J Mol Diagn
2008;10(4):346-54 doi 10.2353/jmoldx.2008.080011.
27. Fritzmann J, Morkel M, Besser D, Budczies J, Kosel F, Brembeck FH, et al. A Colorectal Cancer
Expression Profile That Includes Transforming Growth Factor β Inhibitor BAMBI Predicts
Metastatic Potential. Gastroenterology 2009;137(1):165-75 doi 10.1053/j.gastro.2009.03.041.
28. Kazem MA, Khan AU, Selvasekar CR. Validation of nomogram for disease free survival for
colon cancer in UK population: A prospective cohort study. Int J Surg 2016;27:58-65 doi
10.1016/j.ijsu.2015.12.069.
29. Adam R, Delvart V, Pascal G, Valeanu A, Castaing D, Azoulay D, et al. Rescue surgery for
unresectable colorectal liver metastases downstaged by chemotherapy: a model to predict
long-term survival. Annals of surgery 2004;240(4):644-57; discussion 57-8.
30. André T, Boni C, Mounedji-Boudiaf L, Navarro M, Tabernero J, Hickish T, et al.
Oxaliplatin, Fluorouracil, and Leucovorin as Adjuvant Treatment for Colon Cancer. New
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
21 / 26
England Journal of Medicine 2004;350(23):2343-51 doi 10.1056/NEJMoa032709.
31. Des Guetz G, Schischmanoff O, Nicolas P, Perret GY, Morere JF, Uzzan B. Does microsatellite
instability predict the efficacy of adjuvant chemotherapy in colorectal cancer? A systematic
review with meta-analysis. Eur J Cancer 2009;45(10):1890-6 doi 10.1016/j.ejca.2009.04.018.
32. Vilar E, Gruber SB. Microsatellite instability in colorectal cancer-the stable evidence. Nature
reviews Clinical oncology 2010;7(3):153-62 doi 10.1038/nrclinonc.2009.237.
33. Jover R, Nguyen TP, Perez-Carbonell L, Zapater P, Paya A, Alenda C, et al. 5-Fluorouracil
adjuvant chemotherapy does not increase survival in patients with CpG island methylator
phenotype colorectal cancer. Gastroenterology 2011;140(4):1174-81 doi
10.1053/j.gastro.2010.12.035.
34. Gunderson LL, Jessup JM, Sargent DJ, Greene FL, Stewart A. Revised tumor and node
categorization for rectal cancer based on surveillance, epidemiology, and end results and
rectal pooled analysis outcomes. Journal of clinical oncology : official journal of the American
Society of Clinical Oncology 2010;28(2):256-63 doi 10.1200/jco.2009.23.9194.
35. Nitsche U, Maak M, Schuster T, Kunzli B, Langer R, Slotta-Huspenina J, et al. Prediction of
prognosis is not improved by the seventh and latest edition of the TNM classification for
colorectal cancer in a single-center collective. Annals of surgery 2011;254(5):793-800;
discussion -1 doi 10.1097/SLA.0b013e3182369101.
36. Zhang J-X, Song W, Chen Z-H, Wei J-H, Liao Y-J, Lei J, et al. Prognostic and predictive value of a
microRNA signature in stage II colon cancer: a microRNA expression analysis. The Lancet
Oncology 2013;14(13):1295-306 doi 10.1016/s1470-2045(13)70491-1.
37. de Sousa EMF, Colak S, Buikhuisen J, Koster J, Cameron K, de Jong JH, et al. Methylation of
cancer-stem-cell-associated Wnt target genes predicts poor prognosis in colorectal cancer
patients. Cell Stem Cell 2011;9(5):476-85 doi 10.1016/j.stem.2011.10.008.
38. Arpino G, Generali D, Sapino A, Del Mastro L, Frassoldati A, de Laurentis M, et al. Gene
expression profiling in breast cancer: a clinical perspective. Breast 2013;22(2):109-20 doi
10.1016/j.breast.2013.01.016.
39. Marusyk A, Almendro V, Polyak K. Intra-tumour heterogeneity: a looking glass for cancer?
Nature reviews Cancer 2012;12(5):323-34 doi 10.1038/nrc3261.
40. Zhang ZY, Luo QF, Yin XW, Dai ZL, Basnet S, Ge HY. Nomograms to predict survival after
colorectal cancer resection without preoperative therapy. BMC cancer 2016;16(1):658 doi
10.1186/s12885-016-2684-4.
41. Kawai K, Sunami E, Yamaguchi H, Ishihara S, Kazama S, Nozawa H, et al. Nomograms for
colorectal cancer: A systematic review. World journal of gastroenterology
2015;21(41):11877-86 doi 10.3748/wjg.v21.i41.11877.
42. Abdul Aziz NA, Mokhtar NM, Harun R, Mollah MMH, Mohamed Rose I, Sagap I, et al. A
19-Gene expression signature as a predictor of survival in colorectal cancer. BMC Medical
Genomics 2016;9(1) doi 10.1186/s12920-016-0218-1.
43. Shi M, He J. ColoFinder: a prognostic 9-gene signature improves prognosis for 871 stage II
and III colorectal cancer patients. PeerJ 2016;4:e1804 doi 10.7717/peerj.1804.
44. Xu G, Zhang M, Zhu H, Xu J. A 15-gene signature for prediction of colon cancer recurrence
and prognosis based on SVM. Gene 2017;604:33-40 doi 10.1016/j.gene.2016.12.016.
45. Nguyen MN, Choi TG, Nguyen DT, Kim JH, Jo YH, Shahid M, et al. CRC-113 gene expression
signature for predicting prognosis in patients with colorectal cancer. Oncotarget
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
22 / 26
2015;6(31):31674-92 doi 10.18632/oncotarget.5183.
46. Wang L, Shen X, Wang Z, Xiao X, Wei P, Wang Q, et al. A molecular signature for the
prediction of recurrence in colorectal cancer. Molecular cancer 2015;14:22 doi
10.1186/s12943-015-0296-2.
47. Vargas T, Moreno-Rubio J, Herranz J, Cejas P, Molina S, Gonzalez-Vallinas M, et al.
ColoLipidGene: signature of lipid metabolism-related genes to predict prognosis in stage-II
colon cancer patients. Oncotarget 2015;6(9):7348-63 doi 10.18632/oncotarget.3130.
48. Sveen A, Agesen TH, Nesbakken A, Meling GI, Rognum TO, Liestol K, et al. ColoGuidePro: a
prognostic 7-gene expression signature for stage III colorectal cancer patients. Clinical cancer
research : an official journal of the American Association for Cancer Research
2012;18(21):6001-10 doi 10.1158/1078-0432.ccr-11-3302.
49. Salazar R, Roepman P, Capella G, Moreno V, Simon I, Dreezen C, et al. Gene expression
signature to improve prognosis prediction of stage II and III colorectal cancer. Journal of
clinical oncology : official journal of the American Society of Clinical Oncology
2011;29(1):17-24 doi 10.1200/jco.2010.30.1077.
50. Kalady MF, Dejulius K, Church JM, Lavery IC, Fazio VW, Ishwaran H. Gene signature is
associated with early stage rectal cancer recurrence. J Am Coll Surg 2010;211(2):187-95 doi
10.1016/j.jamcollsurg.2010.03.035.
51. Peng J, Wang Z, Chen W, Ding Y, Wang H, Huang H, et al. Integration of genetic signature and
TNM staging system for predicting the relapse of locally advanced colorectal cancer. Int J
Colorectal Dis 2010;25(11):1277-85 doi 10.1007/s00384-010-1043-1.
52. Watanabe T, Kobunai T, Yamamoto Y, Kanazawa T, Konishi T, Tanaka T, et al. Prediction of liver
metastasis after colorectal cancer using reverse transcription-polymerase chain reaction
analysis of 10 genes. Eur J Cancer 2010;46(11):2119-26 doi 10.1016/j.ejca.2010.04.019.
53. Hao JM, Chen JZ, Sui HM, Si-Ma XQ, Li GQ, Liu C, et al. A five-gene signature as a potential
predictor of metastasis and survival in colorectal cancer. J Pathol 2010;220(4):475-89 doi
10.1002/path.2668.
54. Watanabe T, Kobunai T, Sakamoto E, Yamamoto Y, Konishi T, Horiuchi A, et al. Gene
expression signature for recurrence in stage III colorectal cancers. Cancer 2009;115(2):283-92
doi 10.1002/cncr.24023.
55. Garman KS, Acharya CR, Edelman E, Grade M, Gaedcke J, Sud S, et al. A genomic approach to
colon cancer risk stratification yields biologic insights into therapeutic opportunities.
Proceedings of the National Academy of Sciences of the United States of America
2008;105(49):19432-7 doi 10.1073/pnas.0806674105.
56. Lin YH, Friederichs J, Black MA, Mages J, Rosenberg R, Guilford PJ, et al. Multiple gene
expression classifiers from different array platforms predict poor prognosis of colorectal
cancer. Clinical cancer research : an official journal of the American Association for Cancer
Research 2007;13(2 Pt 1):498-507 doi 10.1158/1078-0432.ccr-05-2734.
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
23 / 26
Table 1 Genomic Signatures Included in the Study
First
author Prognosis
Number of genes
Patients with the signature classified as (%)a
Reference Original signature Available genes (%)
Poor Uncertainty Good
Aziz OS 19 15(78.9%)
201(23.5%) 349(40.8%) 305(35.7%) (42)
Chang DFS 95 95(100.0%)
341(39.9%) 299(35%) 215(25.1%) (9)
Chen OS 7 6(85.7%)
293(34.3%) 265(31%) 297(34.7%) (12)
Shi DFS 9 9(100.0%)
396(46.3%) 300(35.1%) 159(18.6%) (43)
Xu OS 15 13(86.7%)
299(35%) 492(57.5%) 64(7.5%) (44)
Minh DFS 77 74(96.5%)
358(41.9%) 366(42.8%) 131(15.3%) (45)
Shen DFS 45 32(71.1%)
335(39.2%) 338(39.5%) 182(21.3%) (46)
Teodoro DFS 4 4(100.0%)
0(0.0%) 855(100%) 0(0.0%) (47)
Agesen DFS 13 13(100.0%)
74(8.7%) 688(80.5%) 93(10.9%) (10)
Popovici DFS 64 56(87.5%)
333(38.9%) 140(16.4%) 382(44.7%) (14)
Seveen DFS 7 7(100.0%)
140(16.4%) 598(69.9%) 117(13.7%) (48)
Merlos DFS 264 247(93.6%)
156(18.2%) 235(27.5%) 464(54.3%) (23)
Oh DFS, OS 87 81(93.1%)
278(32.5%) 269(31.5%) 308(36%) (13)
Salazar DFS 18 16(88.9%)
192(22.5%) 326(38.1%) 337(39.4%) (49)
Kalady DFS, OS 36 24(66.7%)
245(28.7%) 429(50.2%) 181(21.2%) (50)
Peng DFS 8 7(87.5%)
120(14%) 453(53%) 282(33%) (51)
Toshiaki DFS 10 9(90.0%)
323(37.8%) 232(27.1%) 300(35.1%) (52)
Fritzmann DFS 113 100(88.5%)
568(66.4%) 166(19.4%) 121(14.2%) (27)
Hao DFS 4 4(100.0%)
188(22%) 577(67.5%) 90(10.5%) (53)
Jorissen DFS 122 102(83.6%)
370(43.3%) 231(27%) 254(29.7%) (53)
Smith DFS, OS 34 29(85.3%)
337(39.4%) 395(46.2%) 123(14.4%) (11)
Watanabe DFS 31 22(71.0%)
156(18.2%) 432(50.5%) 267(31.2%) (54)
Garman DFS, OS 46 37(80.4%)
193(22.6%) 336(39.3%) 326(38.1%) (55)
Jiang DFS 7 6(85.7%)
216(25.3%) 372(43.5%) 267(31.2%) (26)
Lin.a DFS 16 14(87.5%)
141(16.5%) 518(60.6%) 196(22.9%) (56)
Lin.b DFS 22 18(81.8%)
319(37.3%) 324(37.9%) 212(24.8%) (56)
Wang DFS 19 19(100.0%)
226(26.4%) 507(59.3%) 122(14.3%) (46)
aPatients were classified as good, poor or uncertain by the respective published genomic
signatures based on prediction results (false discover rate [FDR] <0.05) of the nearest
template prediction (NTP).
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
24 / 26
Figure legends
Figure 1. Bioinformatic analysis of gene lists included in this study. A. Landscape
distribution characteristics of the gene lists as reported in 27 prognostic signatures.
Genes with good and poor prognostic values in each signature are shown as short
green and red vertical bars, respectively. The number of gene occurrences is
graphically depicted in the upper panel. Each row represents one signature annotated
by the first author of the corresponding article. All the unique genes were sorted
alphabetically. B-E. GO and KEGG pathway analyses of genes in the 27 signatures.
The vertical axis represents Go or KEGG pathway annotations. The horizontal axis
represents the number of genes assigned to the corresponding annotation. (B)
Biological process; (C) cellular components; (D) molecular function; and (E) KEGG
pathway. F. Protein-protein interaction (PPI) networks. Nodes represent the proteins,
and edges represent the physical interaction. The black and gray nodes represent
genes with good and poor prognostic values, respectively. CRC recurrence-related
pathways that were significantly enriched by KEGG analysis form some closely
connected regions and are highlighted in the colored box.
Figure 2. Global prognostic performance of the published CRC signatures. A.
Concordance of signature-based prediction results in the training set. Each column
represents the prediction of each individual sample. The orange, blue and white bars
indicate good, poor and uncertainty prognoses of the corresponding signature,
respectively. B. Heatmap of Cramer’s V coefficients showing correlation between
these published signatures. C. Analysis of prognostic performance using Cox
regression; left: univariate analysis; right: clinical factors adjusted for multivariate
analysis.
Figure 3. Development of the composite clinicopathologic-genomic nomogram. A.
Composite nomogram to predict disease-free survival for patients with CRC. Each
category within these variables was assigned to a point on the top scale based on the
coefficients from Cox regression. By summing all of the assigned points for the seven
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
25 / 26
variables and drawing a vertical line between the total points and survival probability
axis, we easily obtained the estimated probabilities of 1-, 3- and 5-year disease-free
survival (DFS). B. Calibration curves for predicting patient survival at 1, 3 and 5
years. The actual DFS is plotted on the y-axis, and the nomogram-predicted DFS is
plotted on the x-axis; a plot along the 45-degree line would indicate a perfect
calibration model in which the predicted probabilities are identical to the actual
outcomes. C. The total point distribution of each CRC patient and their disease-free
survival time and status. The blue dotted line represents the cutoff value dividing
patients into low-, intermediate- and high-risk groups. D. Kaplan-Meier survival
curves of DFS for different risk groups using the nomogram in the training set. E.
Time-dependent ROC curves comparing the prognostic accuracies of 5-year
disease-free survival among the prognostic signatures combined with
clinicopathological features and the nomogram in the training set. The AUC 95% CIs
were calculated from 1000 bootstraps of the survival data. ROC, receiver operator
characteristic. AUC, area under the curve.
Figure 4. Verifying the prognostic value of the nomogram in the validation set. A.
The total point distribution of each CRC patient and their disease-free survival time
and status. The blue dotted line represents the cutoff value dividing patients into low-,
intermediate- and high-risk groups. B. Kaplan-Meier survival curves of DFS for
different risk groups using the nomogram in the validation set. C. Time-dependent
ROC curves comparing the prognostic accuracies of 5-year disease-free survival
among the prognostic signatures in combination with clinicopathological features and
the nomogram in the validation set. D, E. Kaplan-Meier survival curves for patients in
different risk subgroups, which were stratified by TNM stages II (D) and III (E).
Figure 5. Effect of chemotherapy on overall survival in patients stratified by the
nomogram. A. Kaplan-Meier survival curves for all patients who received
chemotherapy. B-D. Effects of adjuvant chemotherapy on overall survival in the
different subgroups. Low-risk group (B); intermediate risk group (C); high-risk group
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
26 / 26
(D).
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063
Published OnlineFirst May 21, 2018.Mol Cancer Res Yongfu Xiong, Wenxian You, Min Hou, et al.
CancerFeatures Improves Prognosis Prediction for Colorectal Nomogram Integrating Genomics with Clinicopathological
Updated version
10.1158/1541-7786.MCR-18-0063doi:
Access the most recent version of this article at:
Material
Supplementary
http://mcr.aacrjournals.org/content/suppl/2018/05/19/1541-7786.MCR-18-0063.DC1
Access the most recent supplemental material at:
Manuscript
Authorbeen edited. Author manuscripts have been peer reviewed and accepted for publication but have not yet
E-mail alerts related to this article or journal.Sign up to receive free email-alerts
Subscriptions
Reprints and
To order reprints of this article or to subscribe to the journal, contact the AACR Publications
Permissions
Rightslink site. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC)
.http://mcr.aacrjournals.org/content/early/2018/05/19/1541-7786.MCR-18-0063To request permission to re-use all or part of this article, use this link
on July 25, 2020. © 2018 American Association for Cancer Research. mcr.aacrjournals.org Downloaded from
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on May 21, 2018; DOI: 10.1158/1541-7786.MCR-18-0063