Abstract

Lymphangiogenesis, an integral contributor to lymphatic metastasis, is a significant reason for the poor prognosis of cancer patients. Anti-lymphangiogenesis treatment is a promising novel therapeutic direction, especially for tumors resistant to conventional therapies. We confirmed the ectopic expression of lymphangiogenesis-related genes (LRGs) in lung adenocarcinoma (LUAD) cohorts based on the TCGA database. We constructed a prediction signature with 15 LRG prognostic signatures (F2RL1, LOXL2, MKI67, PTPRM, GPI, POSTN, INHA, LDHA, LINC00857, ITGA2, PECAM1, SOD3, GDF15, SIX1, and FGD5), and the overall survival (OS) was significantly different between the high- and low-risk groups (TCGA-training: , TCGA-test: , GSE30219: , GSE37745: , and GSE50081: ). Moreover, the risk score was also associated with the PIK3CA and BRCA1 pathways. In the nomogram, the prognostic prediction of the risk score was better than that of clinicopathologic parameters in OS, including age, sex, stage, T stage, N stage, and M stage. In summary, we constructed and validated a 15-LRG signature, which may help predict the prognosis of LUAD and offer a possible direction for future research on downstream molecular mechanisms.

1. Introduction

Lung adenocarcinoma (LUAD) is the prevailing form of primary lung cancer, constituting approximately 30%–35% of cases [1]. Among patients diagnosed with LUAD, mortality primarily results from distant metastasis and cancer recurrence [2, 3], leading to a mere 15%, 5-year survival rate [4]. Consequently, the identification of innovative therapeutic targets for LUAD is of paramount importance.

A poor prognosis for cancer patients is largely the result of the complex tumor microenvironment (TME), especially the tumor immune microenvironment [5]. According to research, the TME, composed of noncancer cells, extracellular matrix (ECM), blood vessels, and lymphatics, facilitates the growth of cancer cells [6]. Immune cells in the TME promote cancer development and progression: They constitute the immunosuppressive TME and prevent tumor immune escape and carcinogenesis [7]. Chen et al. [8] found that the TME score was a significant parameter to evaluate the prognosis of LUAD patients. On the other hand, the LUAD TME exhibits extensive lymphangiogenesis, which is considered an integral contributor to lymphatic metastasis [9]. Specifically, Sasso et al. [10] found that lymphangiogenic tumors respond far better to immunotherapy than their nonlymphangiogenic counterparts [10]. It has been indicated that cancer-associated lymphangiogenesis is involved in lymph node metastasis, resulting in the unfavored survival of LUAD patients [9]. Recent studies on lymphatic vessel biology, such as advancements in intravital imaging techniques for monitoring lymphangiogenesis and lymphatic metastases, have fostered a recognition of the significant role played by the lymphatic system in the initiation, advancement, and progression of cancer [1113]. There is still some uncertainty as to whether lymphatic dissemination is mediated by cancer cell invasion of newly formed lymphatic vessels induced by tumors [14]. To elucidate the role of lymphangiogenesis in the carcinogenesis of LUAD, we used multiple public databases to detect lymphangiogenesis-related genes (LRGs) expression in LUAD. Then, we confirmed the hub 15 LRGs with prognostic significance in LUAD patients by Cox univariate regression, including FGD5, SIX1, GDF15, SOD3, PECAM1, ITGA2, LINC00857, LDHA, INHA, POSTN, GPI, PTPRM, MKI67, LOXL2, and F2RL1. Subsequently, we constructed a prognostic model by LASSO regression based on these hub LRGs for LUAD patients. Moreover, we further confirmed the association between risk score and multiple molecular pathological parameters, including immune infiltration, DNA alteration, drug sensitivity, and clinical stage.

2. Methods

2.1. Data Download

The processed LUAD original mRNA data were obtained from the TCGA database (https://portal.gdc.cancer.gov/) [15]. The GSE30219 [16] dataset, with an annotation platform of GPL570, was downloaded, resulting in the extraction of 85 samples. Additionally, the GSE37745 [17] dataset was downloaded, resulting in the extraction of 106 samples. The GSE50081 [18] was downloaded, with an annotation platform of GPL570, and a total of 127 samples were extracted.

2.2. GO and KEGG Analysis

GO and KEGG pathway analysis were based on the Metascape database (https://www.metascape.org) [19].

2.3. Model Construction and Prognosis

The lasso regression model was constructed by 22 prognostic LRGs. Specific methods can be found in our previous study [20].

2.4. Drug Susceptibility Analysis

The examination of drug sensitivity to cancer treatments was conducted using a genome database derived from the CancerRxGene, the largest database of cancer drugs available at https://www.cancerrxgene.org/ [21]. To predict the chemotherapy sensitivity of individual tumor samples, the R software package “pRRophetic” was employed.

2.5. Analysis of Immune Infiltration

The CIBERSORT algorithm [22] was employed to deconvolve the expression matrix of immune cell subtypes. Within the 547 biomarkers, there were specific markers capable of distinguishing 22 subsets of human immune cells, including T cells, B cells, plasma cells, and myeloid cells. In this study, the CIBERSORT algorithm was utilized to analyze patient data and ascertain the relative proportions of the 22 immune infiltrating cell types. Pearson correlation analysis was conducted to examine the relationship between immune cell content and gene expression.

2.6. Gene Set Variation Analysis (GSVA) and Gene Set Enrichment Analysis (GSEA)

GSVA and GSEA were performed as previously described [23, 24].

2.7. Statistical Analysis

All statistical tests conducted in this study were two-tailed, with a significance level of indicating statistical significance. The analyses were carried out utilizing the R language (Version 3.6).

3. Results

3.1. Identification of Lymphangiogenesis-Related Prognostic Genes in LUAD

The LUAD mRNA original data were obtained from the TCGA through the GeneCards database and subsequently processed. This led to the acquisition of a 507 lymphangiogenesis gene set. The differential expression of these genes was then validated using the limma package. In order to ascertain genes that exhibited differential expression, the screening criteria employed were LogFC > 1 and . The differential expression of 104 lymphangiogenesis genes was assessed (Figure 1(a)), with 48 upregulated genes and 56 downregulated genes. Cox univariate regression analysis was employed to identify prognostic genes in patients with LUAD. The analysis revealed that a total of 22 genes exhibited significant predictive value (p value < 0.05) in determining the prognosis of LUAD patients (Figure 1(b)).

3.2. Functional Enrichment for Transcriptional Network

We used pathway analysis of these 22 prognostic genes, which suggested that they were enriched in cell population proliferation, blood vessel development, cell adhesion molecule binding, and other pathways (Figure 2(a)). Using Cytoscape, PPI network analyses were conducted on the genes in the prognostic gene set (Figure 2(b)). These results suggest that 19 of the 22 prognostic genes interact and may play a major role in regulating lymphangiogenesis in LUAD.

3.3. Construction of a Prognostic Model

For the confirmation of LUAD signature genes, lasso regression was used as a feature selection algorithm. By lasso regression, patients were randomly assigned to training and validation groups in a 4 : 1 ratio (Figures 3(a) and 3(b)). Therisk score = FGD5x − 0.12780561 + SIX1x − 0.120340655 + GDF15x − 0.046094558 + SOD3x − 0.039896584 + PECAM1x − 0.010267393 + ITGA2x 0.019577509 + LINC00857 × 0.037512424 + LDHA × 0.043340545 + INHA × 0.060290303 + POSTN × 0.064482379 + GPI × 0.064968889 + PTPRM × 0.067756826 + MKI67 × 0.118176034 + LOXL2 × 0.148237808 + F2RL1 × 0.186031784. The patients were categorized into low-risk and high-risk groups based on their risk scores, and subsequently, their Kaplan–Meier curves were analyzed. In both the training and test cohorts, the overall survival (OS) of patients with a high-risk profile in LUAD was significantly shorter compared to those with a low-risk profile (training cohort: ; test cohort: ) (Figures 3(c) and 3(d)). Additionally, the receiver operating characteristic (ROC) curves for both the training and test sets demonstrated the model’s effective validation performance (Figures 4(a) and 4(b)).

3.4. Relationship between Prognostic Models and the Immune Microenvironment

The TME encompasses a diverse array of growth factors, inflammatory factors, ECM components, distinctive physical and chemical properties, cancer cells, and fibroblasts [25, 26]. These TMEs significantly impact the prognosis, diagnosis, and treatment response of patients [27]. To gain deeper insights into the influence of the risk score on the progression of LUAD, we conducted an analysis to examine the association between the risk score and tumor immune infiltration. Figure 5(a) illustrates the distribution of immune cell percentages in both the high-risk and low-risk groups. Furthermore, a comparative analysis was conducted to examine the immune cell composition in low- and high-risk groups. The findings revealed a significant decrease in follicular helper T cells, activated NK cells, monocytes, resting dendritic cells, and resting mast cells within the high-risk group. Conversely, CD4 memory-activated T cells, resting NK cells, and M0 macrophages exhibited a significant increase (Figure 5(b)). Subsequently, an additional investigation was undertaken to evaluate the correlation between the risk score and immune cell content. The findings of this study indicate a significant positive correlation between the risk score and activated memory CD4 T cells, M0 macrophages, and resting NK cells, among others. Conversely, a significant negative correlation was observed with resting mast cells, resting dendritic cells, and monocytes (Figure 5(c)). Additionally, an analysis of immune regulatory genes was conducted, revealing disparities in the expression of immune-related chemokines, immunosuppressants, immune-stimulating factors, and immune receptors (Figure 6(a)6(d)).

3.5. The Clinical Significance of the Model through Multi-Omics Research

In the context of early-stage LUAD, the efficacy of surgery and chemotherapy has been established. In our research, we employed the R package “pRRophetic” to analyze the risk score and assess the sensitivity to chemotherapy drugs, utilizing the GDSC database. Through this study, we observed a significant correlation between the risk score and the chemosensitivity of LUAD patients to AS601245, ATRA, ABT.888, MS.275, roscovitine, and salubrinal drugs (Figure 7(a)). In the subsequent phase, we conducted an investigation into the specific signaling pathways and molecular mechanisms implicated in high- and low-risk models, in order to elucidate the potential molecular mechanisms through which risk scores impact tumor progression. Based on the findings from the GSVA analysis, it was observed that the differential pathways between the two groups were predominantly enriched in E2F targets, cell cycle checkpoint, unfolded protein response, and various other signaling pathways (Figure 7(b)). The GSEA analysis further revealed that the highly expressed pathways included mismatch repair, oocyte meiosis, and ubiquitin-mediated proteolysis (Figure 7(c)). Furthermore, an investigation into the mutation profile of patients categorized as high and low risk was conducted. Notably, Figure 7(d) demonstrates a significantly higher occurrence of mutations in TP53 and other genes in the two groups. Combining the above results, we found that the high-risk group showed stronger resistance to multiple drugs, such as AS601245, ATRA, ABT.888, MS.275, roscovitine, and Salubrinal, compared with the low-risk group. The differences may be attributed to the abnormal function of various signaling pathways and molecules, as well as drug resistance caused by gene mutations.

3.6. Model Stability Verification by External Datasets

The processed data from the GEO database of LUAD patients (GSE30219, GSE37745, GSE50081) were obtained. The clinical classification of LUAD patients in the GEO database was predicted using a model, and the stability of the prediction model was assessed through Kaplan‒Meier analysis, which compared the survival differences between the two groups. Based on the GEO external validation set (Figure 8(a)8(c)), it was found that the overall survival (OS) of the high-risk group was lower compared to the low-risk group (GSE30219: ; GSE37745: ; GSE50081: ). In order to validate the accuracy of the model, we conducted an analysis on an external dataset employing ROC curves. The results exhibited a robust predictive efficacy in forecasting the prognosis of patients with LUAD, as depicted in Figure 8(d)8(f). These results suggest that our LASSO algorithm-based prognostic model shows strong sensitivity and specificity when validated by external datasets. This further proves the clinical value of this prognostic model in predicting the prognosis of LUAD patients.

3.7. Incidence Risk and Independent Prognosis Analysis

The samples were categorized into high-risk and low-risk groups based on the median risk score value, and a nomogram was constructed using regression analysis. Logistic regression analysis revealed that the risk score significantly influenced the scoring in the nomogram prediction model for all samples (Figure 9(a)).

Furthermore, prediction analysis of LUAD OS at 3 and 5 years (Figure 9(b)) yielded consistent outcomes. Additionally, univariate and multivariate analyses demonstrated that risk scores were independent prognostic factors in LUAD patients (Figures 10(a) and 10(b)). This nomogram indicates that the risk score level is superior to the clinicopathological parameters such as age, sex, and clinical stage in predicting the accuracy of survival time, and the accuracy of the risk model in predicting the 3- and 5-year survival of LUAD patients was also demonstrated in the column chart, and these results in general also demonstrated the advantage of the risk model in predicting the prognosis of LUAD patients.

3.8. Multivariate Correlation Analysis of Incidence Risk and Clinical Indicators

Clinical index values were divided into groups based on their size, and the outcomes of each group were visually displayed using a boxplot format (Figure 11(a)11(e)). The Kruskal test revealed a significant difference in the distribution of risk score values among groups for the Fustat, stage, M, N, and T clinical indicators (p value < 0.05). Utilizing modeling analysis, it was determined that LUAD samples could be accurately classified using a risk score. To further investigate this, a reverse prediction of 15 model genes was conducted using the miRcode database, resulting in the identification of 13 model miRNAs.

These miRNAs, along with 86 additional miRNAs and 486 mRNA‒miRNA relationship pairs, were visualized using Cytoscape (Figure 12). We conducted a search in the GeneCards database to identify genes associated with LUAD. Subsequently, we analyzed the expression differences of these LUAD-related genes between two groups of patients. Our findings revealed significant differences in the expression of ALK, BRCA1, KRAS, MAP2K1, MET, NRAS, PIK3CA, ROS1, TERT, and other genes in the two patient groups (Figure 13(a)).

Furthermore, we observed a significant correlation between the expression level of a model gene and several LUAD-related genes. Notably, SOD3 exhibited a significant negative correlation with PIK3CA (Pearson r = −0.24), while MKI67 showed a significant positive correlation with BRCA1 (Pearson r = 0.71) (Figure 13(b)). These results suggest that these hub LRGs, which build prognostic models, are closely related to and regulated by key LUAD genes and miRNAs.

3.9. Motif Analysis

The analysis evaluated prognostic genes in the gene set and revealed that they were controlled by several transcription factors. As a result, cumulative recovery curves were used to analyze the enrichment of these transcription factors (Figure 14(a)). As a result of the analysis, cisbp__M5876 was identified as the MOTIF annotation. Four prognostic genes exhibited enrichment in this motif, with a standardized enrichment score (NES) of 5.0. Figure 14(b)14(d) displays the motifs that demonstrated enrichment in prognostic genes, along with their corresponding transcription factors. These results suggest that the 13 hub LRGs are also regulated by a variety of transcription factors. It is further suggested that transcription factors are responsible for the changes of these hub LRGs, which also provides the direction for subsequent research.

4. Discussion

LUAD is an important malignancy in respiratory disease and in a complex disease [28]. While LUAD occurrence rates have decreased in recent years as a result of reduced cigarette use, it remains a malignancy with very poor survival rates when diagnosed at later stages [29]. There are also other factors that can adversely affect LUAD prognoses, such as late diagnosis, lack of specific biomarkers, and cancer cells’ ability to metastasize [30]. According to recent research conducted by Ren et al. [9], it has been observed that lymphangiogenesis could potentially play a crucial role in the development of metastasis and unfavorable prognosis among individuals diagnosed with LUAD. This phenomenon of lymphangiogenesis can manifest in pathological conditions such as inflammation (referred to as inflammation-associated lymphangiogenesis) and tissue repair (known as repair-associated lymphangiogenesis) [31, 32]. Similar to wound lymphangiogenesis, tumor lymphangiogenesis may occur through similar mechanisms [33]. Moreover, Pastushenko et al. [34] found that lymphangiogenesis was involved in the diagnosis, prognosis, and treatment of melanoma patients.

In our study, we first confirmed 22 LRGs in LUAD with prognostic significance, including LDHA, LOXL2, LINC00857, PTGES, F2RL1, MKI67, INHA, MMP14, GPI, POSTN, SOD3, SIX1, PECAM1, STYK1, ITGA2, CLEC14A, RAMP3, FGD5, PTPRM, LCP1, TEK, and GDF15. Further, LASSO regression suggested that F2RL1, LOXL2, MKI67, PTPRM, GPI, POSTN, INHA, LDHA, LINC00857, ITGA2, PECAM1, SOD3, GDF15, SIX1, and FGD5 were hub LRGs in the development and progression of LUAD with prognostic significance. In a previous study, F2RL1 promoted LUAD-associated angiogenesis by activating epidermal growth factor receptor signaling [35]. The promotion of LUAD development is facilitated by LOXL2, which functions as a pivotal gene in both epithelial–mesenchymal transition and copper metabolism [36, 37]. MKI67 is a significant factor in multiple molecular progression of LUAD [38, 39]. GPI has been considered a significant biomarker in LUAD patients and is related to immune infiltration [40, 41]. POSTN might be considered as an important biomarker that accounts for the angiogenic and immune infiltration mechanism of LUAD [42, 43]. INHA has been investigated as a potential biomarker for LUAD immune infiltration [44]. The independent prognostic value of LDHA for patients with LUAD was observed [45]. LINC00857 plays a key role in promoting LUAD progression by multiple molecular mechanisms, including apoptosis escape, the cell cycle, and glycolysis [46, 47]. ITGA2 had prognostic significance in the LUAD patient prognostic model based on methylation and immune biomarkers [48]. PECAM1 makes up a large portion of endothelial cell intercellular junctions, which could predict the prognosis for LUAD patients [49]. GDF15 could enhance the invasion ability of LUAD cells [50]. SIX1 could rescue the anticancer effect of miR-188 to activate the ERK pathway in LUAD cells [51]. However, PTPRM and FGD5 have not been studied in LUAD. These results indicated that these 15 hub LRGs might be involved in immune infiltration in LUAD patients.

Furthermore, we obtained a risk score based on these 15 hub LRGs, which could predict the prognosis of LUAD patients. There was significant NK cell-activated infiltration in low-risk LUAD patients. NK cell-based therapeutics show great potential alone or in combination for the treatment of LUAD [52]. Combined with our results, these LRGs may influence the progression and prognosis of LUAD by regulating the immune infiltration of NK cells. Additionally, disparities in the expression of immune-related chemokines, immunosuppressants, immune-stimulating factors, and immune receptors were observed between the high- and low-risk groups (Figure 6). These findings suggest that these differentially expressed genes may play a crucial role in modulating the extent of NK cell infiltration in patients with LUAD. Previous studies suggested that CXCL12/CXCR4 could stimulate NK cells to secrete MMP1, resulting in the induction of NK cell invasion [53]. LGALS9 functionally impairs NK cells in humans and mice [54]. CD48 could enhance NK cell cytotoxicity in T-cell lymphomas [55]. These results indicated that 15 hub LRGs might drive these immune regulators to modulate NK cell cytotoxicity, which will ultimately result in poor prognosis in LUAD patients.

Furthermore, we also found that the possible molecular mechanism between the risk groups. Our data indicated that these 15 hub LRGs induced ectopic expression of multiple signaling pathway regulators, especially PIK3CA and BRCA1. Korhonen et al. [56] indicated that the Ang2/Tie/PIK3CA pathway was required for lymphangiogenesis [56]. Coso et al. [57] found that VEGFR3 could directly interact with PIK3CA to promote lymphangiogenesis. PIK3CA could also enhance the migration and invasion ability of LUAD cells [58, 59]. Moreover, BRCA1 enhanced cisplatin chemoresistance in LUAD patients [60]. Taken together, these results indicated that these hub LRGs might also drive the PIK3CA pathway and BRCA1 axis to accelerate lymphangiogenesis and carcinogenesis in LUAD patients.

Moreover, we used multiple analysis to confirm upstream molecular regulation of these LRGs, including miRNA network, motif analysis, and correlation analysis among LRGs and LUAD-related genes. A gene closely associated with prognosis is often an excellent therapeutic target in itself. Such as the well-known MCM6, p53, AKT, and ESM1 [23, 62, 63]. With the development of molecular biology, miRNA and transcription factors have also shown high clinical therapeutic value [64, 65]. Therefore, we screened a number of LRG-related miRNAs and transcription factors, which not only partially clarified the underlying molecular mechanism of abnormal LRG expression but also provided directions for future LUAD diagnosis and treatment.

Our study also has some limitations. We used a predictive model to confirm the prognostic significance of hub LRGs in LUAD patients, which was verified by multiple databases and different independent LUAD datasets. We also analyzed the possible molecular mechanism for the risk score between the high- and low-risk clusters. However, possible molecular mechanisms need further experimental verification.

5. Conclusion

In summary, this study presents compelling evidence supporting a newly identified gene signature associated with lymphangiogenesis, which holds promise for investigating the prognosis of patients with LUAD. The 15 LRGs identified in this signature demonstrate remarkable potential as a prognostic tool for LUAD and offer valuable insights for further exploration of underlying molecular mechanisms.

Data Availability

The datasets presented in this study can be found in online repositories. The names of the repositories/repositories and accession number(s) can be found in the article.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflicts of interest.

Authors’ Contributions

J Peng, D Liu, and HF Zhang analyzed the data. QH Hu and W Chen used online tools. J Zou designed the project. J Zhang wrote the paper. H Li, AB Gao, and YK Li revised the manuscript and designed the experiment. All authors contributed to the article and approved the submitted version. Juan Peng, Dan Liu, and Hong-feng Zhang contributed equally to this work.

Acknowledgments

The present study was supported by the Natural Science Foundation of Hunan Province (Nos. 2023JJ50216 and 2022JJ50102), and the Natural Science Foundation of China (No. 82303246).