Prognostic Value of Highly Expressed Type VII Collagen (COL7A1) in Patients With Gastric Cancer

Collagen is a major component in the tumor microenvironment. This study reveals a novel biomarker candidate, type VII collagen (COL7A1), in patients with gastric cancer. To identify genes differentially expressed in gastric cancer tissue, we analyzed cancerous (n = 20) and noncancerous tissues (n = 13) using a DNA microarray. To perform immunohistochemistry and validate the upregulation of COL7A1 expression, we collected 200 more gastric cancer tissues and 100 normal gastric tissues from 200 randomly selected patients who underwent gastrectomy for gastric cancer between January 2010 and December 2013. The correlations between COL7A1 expression and clinicopathological parameters and patients’ overall survival (OS) were analyzed. In the microarray, COL7A1 was upregulated in gastric cancer tissue compared with normal tissue. In the immunohistochemistry study, COL7A1 was more highly expressed in cancer tissue than in normal tissue (p = 0.001). Patients with intracellular COL7A1 expression had significantly poorer five-year OS than those with only extracellular expression (41.5 versus 69.7%, p = 0.001), and the site of expression was an independent prognostic factor of OS (hazard ratio 2.00, 95% CI 1.26–3.16, p = 0.003). Also, we found a significant association between the COL7A1 immunohistochemistry score and distant metastasis (high versus low, odds ratio 4.45, 95% CI 1.40–14.16, p = 0.011). The site and total immunohistochemistry score of COL7A1 expression in gastric cancer showed prognostic significance for OS and distant metastasis, respectively. COL7A1 could be a novel biomarker with diagnostic and therapeutic value.


INTRODUCTION
With the development of diagnostic techniques for early detection [1] and adjuvant treatments [2,3], the long-term survival rate of gastric cancer patients in Korea improved between 2011 and 2015 [4]. However, gastric cancer remains one of the most common cancers in Korea, and it causes high cancer-related mortality worldwide [5]. To overcome its poor prognosis and the limitations of palliative treatment for unresectable, metastatic, or recurrent gastric cancer, researchers and clinical physicians are focusing on the tumor microenvironment in their search for novel biomarkers or a target therapy.
The tumor microenvironment is deeply connected to tumor progression and metastasis in gastric cancer [6]. Among the various components of the tumor microenvironment, collagen is a major structure in the extracellular matrix. Twenty different collagen types have been identified so far [7]; among them, the association with gastric cancer has been studied only for type I. Previous studies revealed that increases in the expression of type I correlate with poor clinical outcomes in gastric cancer patients [8,9].
Unlike fibril-forming collagens (type I) and basement membrane collagens (type IV), type VII collagen (encoded by the COL7A1 gene) functions as the anchoring fibrils for basement membranes and is mostly distributed at the dermal-epidermal junction of skin, the oral mucosa, and the cervix, where it is covered with stratified squamous epithelium [7]. COL7A1 mutations cause an incurable, potentially fatal skin disease called dystrophic epidermolysis bullosa, which is characterized by chronic skin fragility and blistering [10]. Cutaneous squamous cell carcinoma shows an increase in aggressive behavior and metastatic potential when mutated COL7A1 loses its function [11]. In esophageal squamous cell carcinoma, COL7A1 expression correlates with the depth of tumor invasion, lymphatic invasion, and prognosis [12]. The clinicopathological significance of upregulated COL7A1 expression in gastric cancer has not yet been studied.
Our aim in this study was to reveal the clinical significance and prognostic effects of a newly found upregulation of COL7A1 expression in gastric cancer patients. First, we explored many genes differentially expressed in gastric cancer tissue, including COL7A1, with a DNA microarray. Second, we performed immunohistochemistry to evaluate the levels of COL7A1 expressed in gastric cancer tissue and adjacent normal tissue. Third, we further investigated the association between the clinicopathologic variables of gastric cancer patients and COL7A1 expression and evaluated the prognostic significance of COL7A1 expression levels.

Microarray Gene Analysis
A microarray gene analysis (HumanHT-12 v4 BeadChip, Illumina, CA, United States) was performed using gastric cancer (n = 20) and normal gastric tissues (n = 13). This array contains 47,231 probes designed to cover content from NCBI RefSeq Release 38 (November 7, 2009), as well as legacy UniGene content. Briefly, complementary DNA was synthesized from tissue RNA and directly hybridized to the microarray according to the manufacturer's instructions (LAS Inc., Seoul, Korea). Signals from the prepared microarray were scanned, and we used the fluorescent intensities to generate data for all detected genes. We filtered the meaningful gene data by subtracting background signals. In addition, to show similarities in the expression of 4,509 genes among the 33 samples, hierarchical clustering and pairing of the gene expression profiles were performed. After the application of additional statistical criteria, we identified differentially expressed genes (DEGs). A functional annotation analysis of each gene list was performed using the functional annotation tool in the Database for Annotation, Visualization, and Integrated Discovery (DAVID) [13]. DAVID is a comprehensive set of functional annotation tools and has been used for the systematic and integrative analysis of large gene lists. For this work, we highlighted the most relevant terms from each functional annotation category in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and visualized the clustered DEGs in a heatmap plot using an in-house R script. The color key shows log10(FPKM + 1) values, where FPKM is fragments per kilobase of transcript per million mapped reads, which we used to compare gene expression across genes and samples.

Immunohistochemistry Staining
Tissue was embedded in Tissue-Tek® OCT compound (Sakura, United States) and frozen in liquid nitrogen. For staining, 7 µm cryosections of each tissue were air-dried. After fixation with masked formalin (10% non-buffered fluid, DN-4301, DANA, Korea; 10 min) and permeabilization (0.1% Triton X-100 in peroxidase blocking solution for 15 min at room temperature (RT)), we applied 200 µL of peroxidase blocking solution per slide for 15 min at RT. After applying 200 µL per slide of 1:200 diluted primary antibody (Abnova, PAB19138, United States), we incubated the slides overnight at 4°C. Next, 200 µL of antibody enhancer per slide (Polink-2 Plus Rabbit HRP kits, GBI (Golden Bridge International), United States) was applied and incubated for 30 min at RT. After that, we applied 200 µL of polymer per slide (Polink-2 Plus Rabbit HRP kits, GBI, United States) and incubated them for another 30 min at RT. Finally, immunostaining was done with diaminobenzidine solution (DAKO, Agilent, United States). Between steps, the slides were washed three times for 5 min in PBS. The slides were then counterstained with hematoxylin (ScyTek, United States) and mounted.

Grading of COL7A1 Expression
The final immunohistochemistry staining results were scored by one pathologist who was blinded to the clinicopathological characteristics of the patients. The scores were determined as the sum of the intensity and extent of staining, as suggested by a previous study [14]. Staining intensity was scored as 0 (negative), 1 (weakly positive), 2 (moderately positive), or 3 (strongly positive). The extent of staining was scored as the percentage of the area with positive staining: 0 (0-10%), 1 (11-30%), 2 (31-50%), 3 (51-70%), and 4 (71-100%). The tissue staining scores were determined as the sum of the score of the intracellular area (cancer cell or glandular cell; range: 0-7) and the extracellular area (stroma; range: 0-7). The tissue staining scores (range: 0-14) of type VII collagen expression were then divided into two groups for comparison and analysis: the high expression group (total score ≥ 8) and low expression group (total score < 8). We also categorized the patients into two groups by the area of staining: patients whose tissues were stained only at the extracellular matrix and patients whose tissues were also stained at the intracellular space. Because the staining was not even across each slide, the pathologist interpreted the intensity and extent by comparing each sample with negative control (definite: lymphocyte; relative: normal gastric epithelial cell, smooth muscles, nerve fibers) and positive control (fibrocollagenous stroma tissue, fibrohistiocytes) areas of each slide.
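The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the pathologist's actual workflow: the function names are hypothetical, and treating a percentage that falls between two listed bins (for example 10.5%) as belonging to the higher bin is an assumption.

```python
def extent_score(percent_positive: float) -> int:
    """Map the percentage of positively stained area to the 0-4 extent score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    # Upper bounds of the bins 0-10%, 11-30%, 31-50%, 51-70%; above 70% scores 4.
    # Assumption: values between two listed bins fall into the higher bin.
    for score, upper_bound in enumerate((10, 30, 50, 70)):
        if percent_positive <= upper_bound:
            return score
    return 4


def compartment_score(intensity: int, percent_positive: float) -> int:
    """Sum of intensity (0-3) and extent (0-4) for one compartment (range 0-7)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity + extent_score(percent_positive)


def expression_group(intracellular: int, extracellular: int, cutoff: int = 8) -> str:
    """Classify the total score (range 0-14) as 'high' (>= cutoff) or 'low'."""
    total = intracellular + extracellular
    return "high" if total >= cutoff else "low"


# Hypothetical slide: moderate staining in 40% of tumor cells,
# strong staining in 80% of the stroma.
intra = compartment_score(2, 40)   # 2 + 2 = 4
extra = compartment_score(3, 80)   # 3 + 4 = 7
group = expression_group(intra, extra)  # total 11, so "high"
```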

Statistical Analysis
Gene data analysis software (GenomeStudio Gene Expression Module v1.6+, Illumina, CA, United States) was used in the microarray study. The scanned genes in all samples were prefiltered to eliminate genes with missing signals, signal intensity < 0, or a detection p value > 0.05. Expression quantification and quantile normalization were performed before the DEG analysis. DEGs were selected using |fold change| ≥ 2 and Q value ≤ 0.005. The Mann-Whitney test was used with significance set at p < 0.05, and the Benjamini-Hochberg procedure was used for multiple testing correction.
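To make the DEG selection criteria concrete, the following Python sketch applies a Benjamini-Hochberg adjustment to a list of p values and then keeps genes with |fold change| ≥ 2 (that is, |log2 FC| ≥ 1) and Q ≤ 0.005. It is not the GenomeStudio pipeline used in the study; the function names and the example genes and p values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (Q values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so each Q value is capped by the
    # ones ranked above it, which keeps the adjusted values monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def select_degs(names, log2_fold_changes, p_values, fc_cutoff=2.0, q_cutoff=0.005):
    """Keep genes with |fold change| >= fc_cutoff and Q value <= q_cutoff."""
    q_values = benjamini_hochberg(p_values)
    min_abs_log2_fc = math.log2(fc_cutoff)
    return [
        name
        for name, log2_fc, q in zip(names, log2_fold_changes, q_values)
        if abs(log2_fc) >= min_abs_log2_fc and q <= q_cutoff
    ]


# Hypothetical input: three genes with made-up p values.
genes = ["GENE_A", "GENE_B", "GENE_C"]
selected = select_degs(genes, [3.679, 0.5, -2.0], [0.0001, 0.0002, 0.5])
```

In this toy input only GENE_A passes both thresholds: GENE_B fails the fold-change cut-off and GENE_C fails the Q value cut-off.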
The two hundred patients were randomly selected for additional tissue collection using a statistics program. Because no previously reported criteria were available to define cut-off values for the immunohistochemistry scores, we used the Contal and O'Quigley method [15], which maximizes the log-rank statistic, to find the optimal cut-off values for the overall survival analysis. We used the cut-off value (8) to categorize patients into high and low score groups (p = 0.132). To investigate the clinicopathologic factors of the categorized patients, we used the χ² or Fisher exact test for categorical variables and the Mann-Whitney test for continuous variables.
FIGURE 1 | Heatmap of log10(FPKM) values (FPKM, fragments per kilobase of transcript per million mapped reads) to compare gene expression across genes and samples. After pairing and clustering cancerous and noncancerous tissues, 1,662 differentially expressed genes were selected. During the functional annotation analysis, seven genes (COL1A1, COL7A1, CPA2, CPA3, PGA3, PGA4, and PGA5) were mappable to a protein digestion and absorption pathway (KEGG pathway: hsa04974). Among them, COL7A1 and COL1A1 were upregulated, and the others were downregulated [COL7A1: log2(FC) 3.679, p < 0.05, Q < 0.005; COL1A1: log2(FC) 3.281, p < 0.05, Q < 0.005]. When we focused on the genes in the yellow box (A), COL7A1 in cancer tissue was highly expressed in the heatmap (B). As COL7A1 was more highly expressed in tumor tissues than in normal gastric tissues, we selected the gene as the target gene of our study. As for the color key, a log10(FPKM + 1) value of 1 is shown in black because it is mostly the peak value of the FPKM density plot. For this plot, a value of 1 is added to the FPKM to avoid zero-based errors. Red squares, high expression; green squares, low expression; black squares, no difference.
Five-year overall survival (OS) was calculated using the Kaplan-Meier method. The log-rank test was used for the univariate analysis. A Cox proportional hazards model with the backward logistic regression method was used for the multivariate analysis of variables with a p < 0.05 in the univariate analysis. Last, we used a backward logistic regression to find statistically significant risk factors for distant metastasis. This model was evaluated using the area under the receiver operating characteristics curve (AUC) to measure its predictive accuracy. The statistical analyses, including the random selection of patients, were processed using SPSS version 25.0 for Windows (SPSS, Chicago, IL), SAS version 9.4 (SAS Institute, Cary, NC), and R 3.6.1 (Vienna, Austria; http://www.R-project.org/).
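As an illustration of how a five-year OS estimate is obtained with the Kaplan-Meier method, the sketch below implements the product-limit estimator in plain Python. It is not the SPSS, SAS, or R code used in the study; the function names and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time for each patient (for example, months)
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((current_time, survival))
        n_at_risk -= leaving
    return curve


def survival_at(curve, time_point):
    """Read the step function: survival probability at time_point."""
    probability = 1.0
    for event_time, survival in curve:
        if event_time > time_point:
            break
        probability = survival
    return probability


# Hypothetical follow-up in months for six patients (1 = death, 0 = censored).
months = [12, 24, 30, 48, 60, 60]
died = [1, 0, 1, 1, 0, 0]
five_year_os = survival_at(kaplan_meier(months, died), 60)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from the simple proportion of patients known to be alive at 60 months.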

Gene Analysis in Gastric Cancer Tissue
After initial scanning of the fluorescent signals, we found 34,602 genes in the tumor and normal tissues. After pairing and clustering cancerous and noncancerous tissues, the accumulated count of DEGs that survived filtering for technical reliability (no missing signals, signal intensity ≥ 0, and detection p value ≤ 0.05) and gradually elevating statistical probability (|fold change, FC| ≥ 2 and Q value ≤ 0.005) was 1,662. Among them, 918 genes were upregulated, and 744 genes were downregulated in tumor tissue compared with normal tissue.
During the functional annotation analysis, seven genes (COL1A1, COL7A1, CPA2, CPA3, PGA3, PGA4, and PGA5) were mappable to a protein digestion and absorption pathway (KEGG pathway: hsa04974). Among them, COL7A1 and COL1A1 were upregulated, and the others were downregulated [COL7A1: log2(FC) 3.679, p < 0.05, Q < 0.005; COL1A1: log2(FC) 3.281, p < 0.05, Q < 0.005]. When we visualized the gene expression result in a heatmap and magnified the portion in the yellow box (Figure 1A), COL7A1 was upregulated in tumor tissue (red) compared with normal tissue (black, Figure 1B). As COL7A1 was more highly expressed in tumor tissues than in normal gastric tissues, we selected the gene as the target gene of our study.

Immunohistochemistry of COL7A1 in Cancerous and Normal Tissue
The immunohistochemistry scores comparing cancer tissues (n = 200) with normal gastric tissues (n = 100) are shown in Table 1. The mean total immunohistochemistry score of COL7A1 was higher in gastric cancer tissue than in normal stomach tissue (7.50 versus 6.93, p = 0.001). The proportion of tissues with a total immunohistochemistry score of eight or more was significantly higher in tumors than in normal tissues (28.5 versus 15.0%, p = 0.010). The score was higher in the stroma (extracellular) than in the nucleus or cytoplasm (intracellular) of both the main cancer cells and normal glandular cells (Table 1; Figure 2 and Figure 3).

Correlations Between COL7A1 Expression in Cancer Tissue and the Clinicopathological Features of Patients
The clinicopathologic characteristics of patients in the COL7A1-low and -high score groups are shown in Table 2. Fifty-seven of the 200 patients had high COL7A1 scores (total score ≥ 8) in their cancer tissues (28.5%). The COL7A1-high group contained more female patients than the COL7A1-low group (47.4 versus 26.6%, p = 0.005). Cases in which the tumor location was the whole stomach were more frequently found in the COL7A1-high group than in the COL7A1-low group (17.5 versus 5.6%, p = 0.046). Among the twenty patients (20/200, 10%) diagnosed with distant metastases, twelve were in the COL7A1-high group (12/57, 21.1%), and eight were in the COL7A1-low group (8/143, 5.6%, p = 0.001).
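The distant metastasis counts above (12 of 57 in the COL7A1-high group and 8 of 143 in the COL7A1-low group) allow a crude odds ratio to be worked out by hand. The sketch below does so in Python with a Woolf (log-scale) confidence interval. This unadjusted calculation is only an illustration and is not the logistic regression model used in the study, so its output need not match the adjusted estimate; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_events, exposed_total,
                       unexposed_events, unexposed_total, z_crit=1.959964):
    """Crude odds ratio from a 2x2 table with a Woolf 95% confidence interval."""
    a = exposed_events
    b = exposed_total - exposed_events
    c = unexposed_events
    d = unexposed_total - unexposed_events
    if min(a, b, c, d) == 0:
        raise ValueError("Woolf interval is undefined when a cell count is zero")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z_crit * standard_error)
    upper = math.exp(log_or + z_crit * standard_error)
    return odds_ratio, lower, upper


# Counts reported in the text: metastasis in 12/57 (high) and 8/143 (low).
crude_or, ci_low, ci_high = odds_ratio_with_ci(12, 57, 8, 143)
```

With these counts the crude odds ratio is (12 × 135) / (45 × 8) = 4.5.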
We compared the clinicopathologic characteristics according to the sites of COL7A1 expression (only extracellular expression versus extra- and intracellular expression), and the results are presented in Table 2. The total immunohistochemistry scores in the intracellular expression group were significantly higher than those in the only-extracellular expression group.
Table 3 shows the results of the univariate and multivariate OS analyses. When patients were categorized into COL7A1-high and -low groups using the total immunohistochemistry score cut-off value (8), the OS between the groups differed significantly (≥8 versus <8, five-year OS 43.9% versus 66.9%, log-rank test p = 0.010, Figure 4A). When the patients were categorized by COL7A1 expression sites, the OS of the patients with intracellular expression was significantly poorer than that of the patients with only extracellular expression (five-year OS 41.5% versus 69.7%, log-rank test p = 0.001, Figure 4B). In the multivariate analysis (Table 3), the site of COL7A1 expression was an independent prognostic factor (hazard ratio 2.00, 95% confidence interval 1.26-3.16, p = 0.003), along with gross type (Borrmann type IV versus others, p = 0.001), T stage (p = 0.002), and lymphatic invasion (p = 0.025).
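The reported hazard ratio, confidence interval, and p value can be checked for internal consistency, because a Wald interval is symmetric on the log scale. The Python sketch below recovers the standard error of log(HR) from the interval and derives a two-sided p value. It assumes the interval is a standard Wald interval from the Cox model, which the text does not state explicitly; the function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.959964):
    """Standard error of the log estimate implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def wald_p_value(estimate, lower, upper):
    """Two-sided p value for H0: ratio = 1, from a ratio and its 95% CI."""
    z = math.log(estimate) / se_from_ci(lower, upper)
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported for the site of COL7A1 expression.
hazard_ratio, ci_low, ci_high = 2.00, 1.26, 3.16

# The geometric mean of the interval bounds should sit close to the estimate.
log_scale_midpoint = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)
p_value = wald_p_value(hazard_ratio, ci_low, ci_high)
```

Under that assumption, the midpoint of the interval on the log scale falls very close to 2.00 and the derived p value rounds to 0.003, in line with the reported figures.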

DISCUSSION
To our knowledge, we are the first to reveal the association between COL7A1 expression and gastric cancer through immunohistochemistry and to confirm its prognostic significance. COL7A1 expression was denser in cancer tissue than in normal tissue, and most of that expression was found in the extracellular matrix. This high expression score of COL7A1 was an independent risk factor for distant metastasis. Interestingly, patients with intracellular COL7A1 expression showed poorer OS than those with extracellular COL7A1 expression, and the site of expression was an independent prognostic factor for OS.
Using microarray technology, a previous gene expression analysis in gastric cancer found that the expression of several collagen genes increased in gastric cancer tissue, and COL7A1 was one of them (Cancer 119.1, Normal 46.7, Cancer/Normal 2.6, p < 0.05) [16]. In this study, we also began with a DNA microarray to explore the expression of various genes in gastric cancer and found that two types of collagen were highly expressed in cancer tissues: COL7A1 and COL1A1. Because COL1A1 is a common type of collagen and had been studied before [8], we chose COL7A1 for further investigation as a candidate prognostic biomarker. To investigate the distribution of the final target protein (COL7A1), which is a component of the tumor microenvironment, and determine its prognostic significance, we decided to use immunohistochemistry, which is an easy, cost-effective, and widely available method that every pathology laboratory can perform [17].
Currently, an interesting theory is being investigated: that cancer is not only a problem of tumor cells but also a disease that causes a complicated, imbalanced environment surrounding the tumors. In this regard, cancer research has increasingly shifted to the tumor microenvironment. Among the many components of the tumor microenvironment, collagen provides the major structural framework of the extracellular matrix and can thus aid or restrain tumor progression [18].
In experiments with gastric cancer, fibroblasts around peripherally located tumor cells increased their collagen synthesis, reflecting a desmoplastic reaction to the cancer [19,20]. Similar to previous studies, our immunohistochemistry results confirm that collagen synthesis and extracellular deposition increase in cancer tissue. Further research is needed to determine how cancer cells induce the fibroblasts to produce unusual types of collagen, especially type VII, and the mechanism by which type VII collagen contributes to cancer progression. Several explanations are possible. Increased tumor stromal collagen could block infiltrating cytotoxic immune cells and allow the tumor to escape from the host immune system in gastric cancer [21]. A second possibility is that, because type VII collagen functions as anchoring fibrils [7] between the cancer epithelium and the stroma, increased COL7A1 expression could contribute to cancer cell invasion. Previous research investigated the clinical and pathophysiological associations between various types of collagen (though not type VII collagen) and gastric cancer. The previously mentioned study of type I collagen, one of the most widespread and abundant collagens, found that it could be a candidate prognostic factor in gastric cancer. Those researchers used real-time quantitative PCR to evaluate the expression of COL1A1 and COL1A2 in tissue. The mRNA expression was high in advanced cancer tissue, and having high expression correlated with low OS [8]. In scirrhous type gastric cancer, miR-143 is a critical mediator of collagen type III expression, which could support the progression of fibrillar formation [22]. Probably, increased collagen expression correlates with the scirrhous type of gastric cancer. Another study that performed a microarray meta-analysis found that type VI collagen (COL6A3) was regularly overexpressed in gastric cancer cells and suggested that gene could act as an oncogene [23].
Above all, we also found intracellular expression of COL7A1 in tumor cells. That phenomenon was especially frequent in female patients and signet ring cell type cancers. In addition, the total score for COL7A1 expression and prevalence of distant metastasis were also significantly higher in cancers with intracellular expression than in those with only extracellular expression. In the multivariate analysis, the site of COL7A1 expression (intra- or extracellular) was a significant prognostic factor for five-year OS. Signet ring cell type cancer frequently had intracellular expression of COL7A1, and cancers with intracellular expression showed high overall immunohistochemistry scores compared with cancers with only extracellular expression of COL7A1. The intracellular expression of this unusual type of collagen in gastric cancer might thus indicate the aggressiveness of the cancer itself, leading to poor patient prognosis. It seems that the production of collagen in the intracellular space of cancer cells might be associated with a dedifferentiation process such as an epithelial-mesenchymal transition.
The presence of distant metastasis was 91.5% predictable using the total expression score and other variables in a logistic regression analysis. Because collagen is mostly produced by fibroblasts, this result could support the findings of other studies regarding the role of gastric cancer-associated fibroblasts in cancer progression and metastasis [24,25]. Furthermore, we have provided clinical evidence for further experiments about how this uncommon type of collagen triggers or contributes to distant metastasis, especially peritoneal dissemination. COL7A1 might be not only a novel prognostic biomarker but also a therapeutic target, especially when it is intracellularly expressed. Because type VII collagen is mostly distributed in human skin, systemic therapy targeting type VII collagen should be avoided due to possible dermatologic complications. In this study, high expression of COL7A1 and distant metastasis, especially in the peritoneum, were strongly associated. Therefore, an intraperitoneal approach to target therapy in selected patients, such as females, those diagnosed with signet ring cell type cancer, and those with peritoneal metastasis, could be possible.
Although we used more tissues for staining than other studies did, the quality of staining might be poor because, owing to the retrospective character of this study, we could use only frozen tissues and not fresh tissues or formaldehyde-fixed, paraffin-embedded tissues. Also, nuclear staining of COL7A1 needs functional explanation to clearly exclude nonspecific reactions. It remains unclear what precisely is showing antigenicity in the nuclei of tumor cells. Cross-reactivity to other antigens could be a plausible explanation for this observation. This may have happened during our experiment; however, we did not identify nuclear staining in normal tissues similar to that shown in cancer tissues with various intensities. Therefore, we analyzed the cases for general intracellular staining including both cytoplasmic and nuclear staining. Furthermore, to confirm the result of this study, we need more validation studies with tissues of good quality to reveal the biological interaction and mechanism between COL7A1 and tumor cells.

CONCLUSION
In conclusion, we have shown that COL7A1 is overexpressed in gastric cancer and confirmed the clinical and prognostic significance of the intracellular expression of COL7A1. COL7A1 might be one of the tumor microenvironment components that contributes to cancer progression and distant metastasis. It could be a useful biomarker for diagnosis and a therapeutic target. Further investigations are needed to clarify the complex mechanisms of carcinogenesis and the interaction between COL7A1 and various components of the tumor microenvironment.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of Samsung Medical Center, Seoul, Korea. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
M-GC and K-MK contributed to the conception of this study and revised it critically. SO and MO collected and analyzed the data and drafted the manuscript. MO and SO conducted the main experiment. JA, JL, TS, and JB ensured that questions related to the accuracy or integrity of all parts of the work were appropriately investigated and resolved. All authors approved the final version of the manuscript to be published.

FUNDING
This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry