These authors have contributed equally to this work and share first authorship.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
It is widely acknowledged that metastasis determines the prognosis of pancreatic adenocarcinoma (PAAD), and the liver is the most primary distant metastatic location of PAAD. It is worth exploring the value of liver-metastasis-related genetic prognostic signature (LM-PS) in predicting the clinical outcomes of PAAD patients post R0 resection. We collected 65 tumors and 165 normal pancreatic data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression project (GTEx), respectively. Differentially expressed genes (DEGs) between primary tumor and normal pancreatic samples were intersected with DEGs between primary tumor samples with liver metastasis and those without new tumor events. The intersected 45 genes were input into univariate Cox regression analysis to identify the prognostic genes. Thirty-three prognostic liver-metastasis-related genes were identified and included in least absolute shrinkage and selection operator (LASSO) analysis to develop a seven-gene LM-PS, which included six risk genes (ANO1, FAM83A, GPR87, ITGB6, KLK10, and SERPINE1) and one protective gene (SMIM32). The PAAD patients were grouped into low- and high-risk groups based on the median value of risk scores. The LM-PS harbored an independent predictive ability to distinguish patients with a high-risk of death and liver metastasis after R0 resection. Moreover, a robust prognostic nomogram based on LM-PS, the number of positive lymph nodes, and histologic grade were established to predict the overall survival of PAAD patients. Besides, a transcription factor‐microRNA coregulatory network was constructed for the seven LM-PS genes, and the immune infiltration and genomic alterations were systematically explored in the TGCA-PAAD cohort.
Pancreatic adenocarcinoma (PAAD) is one of the most lethal malignancies with strong metastatic ability. It is predicted to become the second most common cause of cancer-related death within the next decade [
The hepatic metastasis process of PAAD involves complicated steps such as the adhesion of dissociated pancreatic cancer cells toward the liver, the formation of the remodeled extra-cellular matrix (ECM), the angiogenesis for micro-metastasis, and the construction of immune escape [
In this study, we integrated the mRNA expression data of normal pancreas and PAAD tissues from the GTEx and the TCGA datasets, respectively. A seven-gene prognostic signature was constructed with the ability to predict both OS and liver metastasis for PAAD patients following R0 resection.
Patient inclusion criteria in the TCGA dataset were [
The 65 PAAD patients in the TCGA cohort were training dataset. The whole TCGA-PAAD dataset with 147 patients (cases with a survival period ≥30 days and complete gene expression data), GEO (
Based on the threshold of |logFC| > 1 and
Prognostic Gene Signature Construction Based on Liver-metastasis-related mRNAs.
Univariate regression Cox analysis was performed to filter mRNAs associated with OS. Then the identified mRNAs were included in the least absolute shrinkage and selection operator (LASSO) Cox regression model to develop a liver-metastasis-related prognostic signature (LM-PS) for the PAAD patients involving seven liver-metastasis-related genes and derive the regression coefficient of each gene by “glmnet” and “survival” R packages. Thereafter, risk scores for patients were calculated according to the following formula:
In the TCGA dataset, the liver-metastasis-related prognostic signature and corresponding clinicopathological factors of the PAAD patients were included in univariate/multivariate Cox proportional hazards analysis and logistic regression model analysis to identify the independent predictors of OS and liver metastasis, respectively. A nomogram was established based on the independent predictors of OS with the “rms” R package. The receiver operating characteristic (ROC) curves and the area under the curve (AUC) values were used to evaluate the predictive ability of the established models in this study.
Based on the threshold of |logFC| > 1 and
The infiltration levels of the 22 immune cells were determined to assess the tumor microenvironment between high- and low-risk groups and between liver-metastasis and non-new-tumor groups in the TGCA dataset by the CIBERSORT R script v1.03. A
The cBioPortal website (
The seven liver-metastasis-related genes identified by the Lasso Cox regression model were uploaded to NetworkAnalyst (
PAAD patients in the TCGA cohort were divided into high- and low-risk groups based on the median value of the risk scores. Mean ± standard deviation or medians (with interquartile range), and counts with percentages were used to present continuous variables and categorical variables, respectively. Continuous variables were compared by Student’s
This study included 65 PAAD patients with R0 resection from the TCGA dataset, and the corresponding clinicopathological factors of the patients were presented in
Clinicopathological factors of the PAAD patients in TCGA dataset.
Clinical features | Patients with no new tumor (n = 50) | Patients with liver metastasis (n = 15) |
|
---|---|---|---|
Age at diagnosis (years), mean ± SD | 61.16 ± 11.1 | 62.9 ± 9.3 | 0.577 |
Gender, n (%) | 0.548 | ||
Female | 19 ( |
7 (46.7) | |
Male | 31 (62) | 8 (53.3) | |
Number of positive lymph nodes, median (interquartile range) | 1 (0–3) | 2 (0-5) | 0.357 |
Survival status, n (%) | <0.001 | ||
Alive | 40 (80.0) | 3 (20.0) | |
Dead | 10 (20.0) | 12 (80.0) | |
Histologic grade, n (%) | 0.473 | ||
G1 | 11 (22.0) | 2 (13.3) | |
G2 | 23 (46.0) | 5 (33.3) | |
G3 | 14 (28.0) | 7 (46.7) | |
G4 | 1 (2.0) | 0 | |
GX | 1 (2.0) | 1 (6.7) | |
Pathologic stage, n (%) | 0.746 | ||
Stage I | 8 (16.0) | 1 (6.7) | |
Stage II | 39 (78.0) | 14 (93.3) | |
Stage III | 0 | 0 | |
Stage IV | 1 (2.0) | 0 | |
Not available | 2 (4.0) | 0 | |
Pathologic T stage, n (%) | 0.441 | ||
T1 | 5 (10.0) | 0 | |
T2 | 8 (16.0) | 1 (6.7) | |
T3 | 35 (70.0) | 14 (93.3) | |
TX | 2 (4.0) | 0 | |
Pathologic N stage, n (%) | 1.000 | ||
N0 | 16 (32.0) | 5 (33.3) | |
N1 | 31 (62.0) | 10 (66.7) | |
NX | 3 (6.0) | 0 | |
Pathologic M stage, n (%) | 0.019 | ||
M0 | 17 (34.0) | 11 (73.3) | |
M1 | 1 (2.0) | 0 | |
MX | 32 (64.0) | 4 (26.7) |
PAAD, pancreatic adenocarcinoma; TCGA, The Cancer Genome Atlas; SD, standard deviation.
Study flow chart.
To build the LM-PS for forecasting the OS of PAAD patients, we included the expression of the 45 liver-metastasis-related genes into the univariate Cox regression analysis and identified 33 genes significantly associated with OS (
Identification of a 7-gene signature for OS by LASSO regression analysis in the TCGA-PAAD cohort
Information of the seven genes in the liver-metastasis-related prognostic signature.
Gene symbol | Description | Ensemble ID | Location (GRCh38/hg38) |
---|---|---|---|
ANO1 | Anoctamin-1 | ENSG00000131620 | chr11:69,985,907–70,189,530 |
FAM83A | Family with sequence similarity 83 | ENSG00000147689 | chr8:123,178,960–123,210,079 |
GPR87 | G protein-coupled receptor 87 | ENSG00000138271 | chr3:151,294,086–151,316,820 |
ITGB6 | Integrin beta-6 | ENSG00000115221 | chr2:160,099,666–160,200,313 |
KLK10 | Kallikrein-10 | ENSG00000129451 | chr19:51,012,739–51,020,175 |
SERPINE1 | Serine protease inhibitor, clade E member 1 | ENSG00000106366 | chr7:101,127,104–101,139,247 |
SMIM32 | Small integral membrane protein 32 | ENSG00000271824 | chr5:136,191,468–136,193,162 |
The patients in the training dataset were divided into two groups based on the median expression of each liver-metastasis-related gene. The Kaplan-Meier curves showed that high expression of ANO1, FAM83A, GPR87, ITGB6, KLK10, SERPINE1, and low expression of SMIM32 were significantly associated with worse patient survival (
Prognostic analysis of the seven liver-metastasis-related genes and the LM-PS
The distribution of clinicopathological factors with the risk score increasing is shown in
As expected, the Kaplan-Meier curve showed that the OS rate was remarkably lower in PAAD patients with liver metastasis after R0 resection than those without new tumor events (
The prognostic ability of the LM-PS to predict liver metastasis in PAAD patients
Logistic regression analysis of liver metastasis post R0 resection.
Factors |
|
Or (95% CI) |
---|---|---|
Risk score | 0.002 | 1.176 (1.061–1.302) |
Age | 0.571 | 1.016 (0.961–1.074) |
Gender | 0.549 | 0.700 (0.219–2.244) |
Histologic grade | 0.220 | 1.718 (0.724–4.077) |
Pathologic stage | 0.483 | 1.839 (0.336–10.081) |
Pathologic T stage | 0.129 | 5.200 (0.620–43.595) |
Pathologic N stage | 0.960 | 1.032 (0.301–3.537) |
Number of positive lymph nodes | 0.387 | 1.112 (0.875–1.413) |
OR, odds ratio; CI, confidence interval.
In the whole TCGA-PAAD cohort, the forest plot showed that high expression of ANO1, FAM83A, GPR87, ITGB6, KLK10, SERPINE1, and low expression of SMIM32 were significantly associated with worse patient survival (
Validation of the LM-PS in the TCGA, GEO, and ICGC Datasets
Univariate and multivariate Cox regression analyses were used to assess the prognostic value of the LM-PS in the TCGA-PAAD dataset. In the training set, the risk score and clinicopathological factors of PAAD patients were included in univariate Cox analysis, and the results showed that risk score was significantly associated with OS [hazard ratio (HR) = 1.132, 95% CI = 1.077–1.190,
Univariate and multivariate analyses of clinicopathological factors and LM-PS with OS in the training set.
Factors | Univariate analysis | Multivariate analysis | ||
---|---|---|---|---|
|
HR (95% CI) |
|
HR (95% CI) | |
Risk score | <0.001 | 1.132 (1.077–1.190) | <0.001 | 1.126 (1.060–1.197) |
Age | 0.056 | 1.045 (0.999–1.093) | – | – |
Gender | 0.556 | 1.300 (0.543–3.113) | – | – |
Histologic grade | 0.007 | 2.245 (1.245–4.048) | 0.017 | 2.530 (1.182–5.415) |
Pathologic stage | 0.042 | 1.240 (1.008–1.526) | 0.055 | 1.382 (0.993–1.923) |
Pathologic T stage | 0.023 | 4.227 (1.225–14.583) | 0.196 | 0.306 (0.051–1.845) |
Pathologic N stage | 0.016 | 3.791 (1.278–11.243) | 0.069 | 0.133 (0.015–1.173) |
Number of positive lymph nodes | 0.025 | 1.208 (1.025–1.425) | 0.005 | 1.689 (1.168–2.441) |
LM-PS, liver-metastasis-related prognostic signature; OS, overall survival; PAAD, pancreatic adenocarcinoma; TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval.
Univariate and multivariate Cox regression analyses were also performed in the whole TCGA-PAAD dataset. The univariate Cox regression analysis showed that risk score was significantly associated with OS (HR = 2.668, 95% CI = 1.761–4.042,
Univariate and multivariate analyses of clinicopathological factors and LM-PS with OS in the whole TCGA-PAAD set.
Factors | Univariate analysis | Multivariate analysis | ||
---|---|---|---|---|
|
HR (95% CI) |
|
HR (95% CI) | |
Risk score | <0.001 | 2.668 (1.761–4.042) | <0.001 | 2.762 (1.680–4.541) |
Age | 0.014 | 1.033 (1.007–1.061) | 0.264 | 1.015 (0.989–1.042) |
Gender | 0.432 | 1.237 (0.728–2.102) | – | – |
Histologic grade | 0.001 | 2.016 (1.354–3.003) | 0.092 | 1.439 (0.943–2.196) |
Pathologic stage | 0.019 | 2.242 (1.139–4.411) | 0.866 | 1.111 (0.329–3.755) |
Pathologic T stage | 0.019 | 2.608 (1.171–5.811) | 0.600 | 0.736 (0.235–2.312) |
Pathologic N stage | 0.013 | 2.321 (1.194–4.512) | 0.840 | 1.090 (0.472–2.515) |
Number of positive lymph nodes | 0.023 | 1.076 (1.010–1.146) | 0.001 | 1.181 (1.066–1.309) |
LM-PS, liver-metastasis-related prognostic signature; OS, overall survival; PAAD, pancreatic adenocarcinoma; TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval.
In the training cohort, the risk score, number of positive lymph nodes, and histologic grade were included to establish a clinically applicable quantitative nomogram to predict the OS of PAAD patients (
Construction and validation of an LM-PS-based nomogram in the TCGA dataset
Then the ROC curves were also plotted to assess the prognostic values of the nomogram and other independent predictors (risk score, histologic grade, and the number of positive lymph nodes) in the whole TCGA-PAAD dataset. The AUCs of the risk score, histologic grade, number of positive lymph nodes, and the nomogram were 0.753, 0.619, 0.665, and 0.815, respectively, indicating that the inclusion of risk score added to the accuracy of the other two factors included in the nomogram (
The landscape of immune cell infiltration of PAAD in the TCGA cohort was explored between low- and high-risk groups and between liver-metastasis and no-new-tumor groups using the CIBERSORT algorithm. The violin plots showed that patients in the high-risk subgroup harbored significantly higher prevalence in M0 macrophages (
The transcription factor‐microRNA coregulatory network based on ANO1, FAM83A, GPR87, ITGB6, KLK10, SERPINE1, and SMIM32 was obtained from NetworkAnalyst. This network contains the seven genes of the LM-PS, 63 transcription factors, and 37 microRNAs, with 86 associations between the LM-PS genes and transcription factors, and 38 associations between the LM-PS genes and microRNAs. As shown in
Transcription factor (blue squares)-microRNA (green squares) coregulatory network for the seven liver-metastasis-related prognostic genes (red circles).
We explored the genomic alteration for the TCGA-PAAD cohort in this study (data of 64 out of the 65 patients could be found on the cBioPortal website). In this study, 45.3% of the patients (29/64) had at least one LM-PS gene alteration with FAM83A, SERPINE1, and GPR87 being the most frequently altered genes (22%, 14%, and 11%, respectively;
The oncoprints of low-
GO and KEGG enrichment analyses were performed for the 408 DEGs between high- and low-risk subgroups to identify the most relevant biological processes and pathways. The GO annotations showed that the DEGs enriched in GO terms mainly related to ECM, cell adhesion, and locomotion such as ECM organization, positive regulation of locomotion, cell-substrate adhesion, cell adhesion molecule binding, cell junction organization, collagen binding, regulation of cell adhesion, and ECM binding (
GO
The current study was mainly based on the TCGA dataset. DEGs between primary PAAD tissues with liver metastasis and those without new tumor events were intersected with DEGs between primary tumor and normal pancreatic tissues, and 45 liver-metastasis-related genes were identified. Thirty-three out of the forty-five DEGs were identified to be significantly associated with the OS of PAAD patients, and seven of them (ANO1, FAM83A, GPR87, ITGB6, KLK10, SERPINE1, and SMIM32) were identified by the LASSO model to construct an LM-PS for predicting the OS of PAAD patients post R0 resection. The up-regulation of ANO1, FAM83A, GPR87, ITGB6, KLK10, and SERPINE1 was significantly associated with poor outcomes of PAAD patients, however, the up-regulation of SMIM32 was protective for the prognosis of PAAD patients. Based on the median value of risk scores generated from the LM-PS, PAAD patients were grouped into the low- and high-risk subgroups and the high-risk subgroup had a significantly lower OS rate than those in the low-risk subgroup. The prognostic value of the LM-PS in the TCGA dataset was validated in the GEO and ICGC datasets. Subsequently, the multivariate Cox analysis and logistic regression analysis confirmed that the LM-PS was a reliable predictor of OS and liver metastasis, respectively. Furthermore, a robust prognostic nomogram based on risk scores, the number of positive lymph nodes, and histologic grade were established to predict 1-, 2-, and 3-years OS for PAAD patients. In addition, immune infiltration analysis showed that only a few kinds of immune cell proportions were different between high- and low-risk subgroups. A transcription factor‐microRNA coregulatory network based on the seven LM-PS genes was constructed for viewing the potential regulatory mechanism of these genes. The cBioPortal was used to give an insight into the impact of genomic alterations on the prognosis of PAAD patients. The genetic alteration frequency in FAM83A, GPR87, and SMIM32 was significantly different between high- and low-risk groups. No significant difference was observed in patient survival between altered and non-altered subgroups. However, compared with those with non-alteration, patients with alterations of FAM83A and ITGB6 were significantly associated with lower OS, while patients with alterations of SMIM32 were significantly associated with higher OS. These results indicate that mutations of FAM83A, ITGB6, and SMIM32 may play important roles in tumor modulation. Additionally, the LM-PS gene altered subgroup was observed to harbor a gene mutation enrichment in KRAS, CDKN2A, and TP53.
The regulatory mechanism of the KRAS, TP53, and CDKN2A has been well discussed. It has been widely acknowledged that KRAS mutation is associated with the poor prognosis of patients with pancreatic cancer [
The modulation of most of the LM-PS genes in this study has been reported to be associated with the progression of pancreatic cancer. A Danish study reported that ANO1 (TMEM16A) was the main component of the Ca2+-activated Cl− channel, which was upregulated in PDAC cell lines. ANO1 was reported to be crucial for cell migration but was not significantly associated with cell proliferation in PDAC [
As a member of the family with sequence similarity 83 (FAM83), FAM83A is, remarkably, up-regulated in PAAD and significantly associated with poor clinical outcomes. Over-expression of FAM83A markedly promotes cancer stem cell traits and chemoresistance by activating the Wnt/β-catenin signaling and TGF-β signaling in pancreatic cancer [
As a G protein-coupled receptor, GPR87 is significantly over-expressed in PAAD tissues and is an independent risk predictor of OS. Elevated GPR87 expression promotes cancer stem cell expansion by regulating the JAK2/STAT3 pathway [
The Kallikreins family are a series of serine proteases modulating the proteolysis scene. They have been reported to be closely correlated with angiogenesis and metastasis [
As a crucial regulator in the plasminogen activator system, SERPINE1 encodes the plasminogen activator inhibitor and was reported to markedly modulate tumor invasion and proliferation and was negatively associated with the OS of PDAC patients [
The GO enrichment analysis showed that the DEGs were mainly enriched in GO terms associated with ECM, cell adhesion, and locomotion. Additionally, the most enriched KEGG pathway was also ECM-receptor interaction. As a reservoir of numerous signaling molecules, the ECM plays an essential role in the tumor microenvironment and has been found to promote metastasis by the degradation of 500–600 proteases [
Recent studies have demonstrated that a low fraction of naive B cells and a high fraction of M0 macrophages were correlated with the decreased OS of PDAC patients [
In addition, the transcription factors FOS, JUND, USF1, and MYC were found to regulate three LM-PS genes in the transcription factor‐microRNA coregulatory network. FOS and JUND belong to the Activator Protein 1 family [
The limitations of our study are first, that it was a retrospective study and, therefore, no adjustment could be made for confounding factors that might have influenced the clinical outcomes. Second, batch effects between TCGA and GTEx datasets cannot be eliminated, though they have been minimized and assessed. Third, limited samples of PAAD patients were included in this study, and prospective large sample studies are needed in the future.
In conclusion, our study constructed a seven-gene signature with the prognostic ability to identify PAAD patients with a high-risk of death and liver metastasis after R0 resection. Furthermore, a clinically applicable nomogram incorporating genetic features and clinicopathological factors was established to predict the OS of PAAD patients. The nomogram may facilitate an individual therapeutic strategy, early intervention, and delayed cancer progression in clinical practice.
The original contributions presented in the study are included in the article/
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.
DF and JL conceived and designed the study. YD, JT, and BY were involved in the acquisition, analysis, and interpretation of the data. YD and JT wrote the original draft of the manuscript. BY and KL prepared the figures and tables. DF and JL critically revised the manuscript for important intellectual content. All authors reviewed and approved the manuscript.
This work was supported by Chinesisch-Deutsche Kooperationsgruppe: Precision Medicine in Pancreatic Cancer (Grant No. GZ 1456), National Natural Science Foundation of China (Grant No. 81772566), and Shanghai Science and Technology Commission (Grant No. 19511121200). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We gratefully acknowledge contributions from the TCGA, GTEx, GEO, ICGC, cBioPortal, and HPA databases.
The Supplementary Material for this article can be found online at: