Abstract
Background. Despite increasing understanding of m6A-related lncRNAs in lung cancer, the role of m6A-related lncRNAs in the prognosis and treatment of lung squamous cell carcinoma is poorly understood to date. Thus, the current study aims to elucidate its role and build a model to predict the prognosis of LUSC patients. Materials and Methods. The data of the current study were accessed from the TCGA database. Pearson correlation analysis was performed to identify lncRNAs correlated to m6A. Next, an m6A-related lncRNAs risk model was built using a single factor, least absolute association, selection operator, and multivariate Cox regression analysis. Results. The relevance between 23 m6A genes and 14,056 lncRNAs is shown by Pearson correlation analysis by Sankey diagram. Multivariate Cox regression analysis determined that 11 m6A-lncRNAs show predictive potential in prognosis, which is confirmed by the consistency index, Kaplan–Meier analysis, principal component analysis, and ROC curve. Additionally, the immune analysis showed that the enrichment of immune cells, major histocompatibility complex molecules, and immune checkpoints in the high and low-risk subgroups were markedly disparate, with the high-risk group showing a stronger immune escape ability and a worse response to immunotherapy. Conclusion. In conclusion, the risk model based on m6A-related lncRNAs showed great promise in predicting the prognosis and the efficacy of immunotherapy.
1. Introduction
Lung cancer has long been the most fatal and the second most common malignancy globally [1]. LUSC accounts for 35% of nonsmall cell lung cancer (NSCLC) cases and shows unique epidemiological, clinicopathological, and molecular characteristics. For instance, it is closely related to smoking, low EGFR mutation rate, and low ALK rearrangement rate, leading to a poor targeted therapy outcome [2]. However, in recent years, as tumor immunotherapy strategies continue to improve, it has been reported that immunotherapy could be effective in LUSC, regardless of the PD-L1 expression and TMB levels [3–7]. At present, with the advent of more and more antitumor drugs, methods to improve the effect of antitumor drugs, especially immunotherapy, have also gratifying results, such as using nanotechnology as a carrier [8–12]. Therefore, identifying biomarkers that could accurately predict patient prognosis and efficacy of immunotherapy is urgently needed.
In eukaryotic cells, N6-methyladenosine (m6A), which participates in RNA biogenesis and function, is the most abundant RNA modification. Importantly, it mediates the modification of noncoding RNA (ncRNA) through various biological components [13]. At the same time, noncoding RNAs can reversely affect tumor progression and metastasis by regulating the m6A modification of mRNAs [13]. For example, FOXM1-ASlncRNA transcribed from the antisense strand of the FOXM1 gene can promote the process of tumorigenesis of GSCs by promoting ALKBH5 (an m6A erasure element) to remove m6A [14]. In addition, lncRNA targeting the reduction of GATA3 expression by inducing pre-mRNAKIAA1429-mediated m6A modification is needed for liver cancer cell development [15]. LncRNA Gas5-AS1 interacting with Gas5 promotes the ALKBH5-mediated m6A demethylation process, resulting in the expression of the tumor suppressor GAS5 and thus impeding the division and invasion of cervical cancer cells [16]. Taken together, m6A and lncRNA are highly correlated and might affect tumor growth and metastasis through their interaction. Additionally, m6A modification is a kind of epigenetic behavior tightly related to lung cancer and the m6A regulatory factor gene has a significant value in predicting prognosis for LUSC [17, 18]. Specifically, lncRNA-ATB might influence LUSC progression by controlling the microRNA-590-5p/NF-90 axis [19]. Similarly, some lncRNAs are underlying factors for predicting the prognosis in LUSC [20]. Nevertheless, the role of m6A-lncRNA in LUSC remains elusive.
Therefore, our study aims to explore whether m6A-lncRNAs could play an important predictive role in LUSC and to find potential markers of immunotherapy through immune-related analysis for screening large patient populations. In addition, we have also identified candidate drugs related to immunotherapy with significant differences in IC50 under this model.
2. Materials and Methods
2.1. Transcriptome and Clinical Data Acquisition
VarScan software was used to obtain the clinical information (gender, age, TNM stage, survival status, and survival time) and gene expression profile data of 505 patients with LUSC from the TCGA database.
2.2. Screening of m6A-Related lncRNAs
The expression matrix of 23 m6A-related genes was screened from the TCGA database [21]. Pearson pertinence analysis was applied to identify lncRNAs of interest, and a total of 2350 lncRNAs related to m6A were subsequently selected (|Pearson R| > 0.4 and ).
2.3. Establishment of the Risk Model
The entire clinical dataset extracted from TCGA was stochastically split into two groups (training subgroup and testing subgroup). The baseline characteristics (gender, age, stage, and TNM stage) of the two subgroups showed no significant differences (). A risk model was then constructed using the training subgroup and verified using the testing and entire subgroups.
To classify the risk level, we used 11 m6A-related lncRNAs that adequately made contact with OS to score the risk of patients of the training set. Screening of m6A-related lncRNAs involved univariate Cox regression analysis as well as LASSO-penalized Cox analysis (using R language package GLMNET) and multivariate Cox ratio hazard regression analysis [22–24]. Risk score = Expr (lncRNA1) × Coef (lncRNA1) + ...... + Expr (lncRNAn) × Coef (lncRNAn) [23].
2.4. Accuracy Testing of This Model in Predicting Prognosis
We test the accuracy of the model by drawing C-index and receiver operating characteristic curve with the R package “timeROC.” [22] 1, 3, and 5 years OS was predicted with the scores acquired by scoring factors that affected prognosis (age, gender, stage, TNM stage, and risk score). We used the R language package “regplot” to draw the alignment diagram [25].
2.5. Independence Test of the Risk Model
To examine whether risk scores could be used as prognosis predictors like other clinical characteristics, univariate Cox and multivariate Cox analyses of the entire set of samples were performed [26].
2.6. PCA Analysis
PCA analysis was performed on the entire gene expression profile, 23 m6A-related genes, 2350 m6A-related lncRNAs, and the risk model to identify the sample difference and reduce high-dimensional data. The R language packages “scatterplot3d” and “limma” were used, respectively [27].
2.7. Immune Function Analysis
First, the gene expression discrepancy between the high-risk group and the low-risk group in the entire dataset with the help of the R language package “limma” was analyzed. Next, the clustering condition of genes that expressed discrepantly was observed by conducting GO analysis using the R package “clusterProfiler” to detect enrichment in different biological processes. [21] The critical value was 0.05. A value less than the threshold revealed which GO terms were markedly clustered [26, 28].
2.8. Immunotherapy and Potential Drug Screening Analysis
The tumor immune dysfunction and rejection score (TIDE) is a calculation framework designed by Peng Jianget al. to integrate different tumor escape mechanisms. The effective samples and TIDE scores were obtained from https://tide.dfci.harvard.edu/ [22]. To explore the potential of therapeutic drugs, the IC50 of the compound obtained from the GDSC website in LUSC patients was predicted using the R language package “pRophetic” [21].
3. Results
3.1. Extracting m6A-Related lncRNA from LUSC Patients
We extracted 23 m6A genes and 14056 lncRNAs. LncRNAs significantly related to one or more 23 m6A genes were termed m6A-related lncRNAs. 2350 lncRNAs related to m6A were obtained, and a Sankey diagram was drawn to observe the potential association between m6A genes and lncRNAs (Figure 1(a)).

(a)

(b)

(c)
First, univariate Cox regression analysis was exerted to screen m6A-related lncRNAs that have a significant correlation with overall survival (Figure S1). Lasso-Cox regression analysis was then applied to accurately and effectively identify predictive markers based on the LASSO-penalized regression model to identify lncRNAs related to overall survival according to the smallest lambda value. The selected m6A-related lncRNAs (n = 16) were incorporated into multivariate regression analysis (Figures 1(b) and 1(c)).
Finally, 11 m6A-lncRNAs independently related to OS were used to construct the risk models (Table 1). Figure 2(a) illustrates the correlation between the m6A genes and lncRNAs used for model construction.

(a)

(b)

(c)

(d)
To assess the potential prognostic value of these m6A-related lncRNAs, 495 patients obtained from the TCGA database were stochastically separated into training and testing groups (Table 2). The training group was used to establish the model and predict its accuracy, while the validation group and the whole dataset were used to verify the model. Based on the median value of the risk score of each group, the LUSC samples were split into high and low-risk subgroups, and K-M survival curves were drawn (Figures 2(b)–2(d)). Next, we analyzed the distribution of risk levels in each group and displayed the status of patients in each subgroup via a dot chart. Finally, a heatmap was generated to visualize the expression patterns of the 11 lncRNAs in two subgroups (Figures 3(a)–3(i)).

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)
The results of the training set analysis suggested that patients in the high-risk group had lower overall survival rates than the low-risk group (). Similar results were obtained when the validation and whole datasets were analyzed. Accordingly, our model has good potential for prognosis prediction.
3.2. Test the Accuracy of the Risk Model in Predicting the Prognosis and Classification
The results from the whole dataset further underwent univariate and multivariate Cox regression analyses, revealing that the risk score is an independent prognostic factor (Figures 4(a) and 4(b)). To accurately illustrate the universality and importance of the risk model in forecasting prognosis, the concordance index (C-index) of risk scores and AUC were evaluated. It was found that the AUC of the risk score was higher compared to clinical characteristics such as age, gender, and tumor stage, with the C-index further indicating good consistency between predicted and actual observations (Figures 4(c) and 4(d)). Nomograms and calibration curves based on age, gender, TNM, and risk score were also drawn to predict the OS of patients at 1, 3, and 5 years, which validated the high accuracy and authenticity of our model (Figures 4(e) and 4(f)).

(a)

(b)

(c)

(d)

(e)

(f)
To assess whether our model is appropriate for patients with different clinicopathological characteristics, the difference between high and low-risk subgroups of OS was analyzed by stratifying according to different clinicopathological characteristics. The results that applying K-M analysis to analyze the entire group of samples based on three clinicopathological characteristics of age, gender, and stage demonstrated that OS in high-risk group was lower, compared to the low-risk subgroup (Figures 5(a)–5(f)). Moreover, similar results were observed after stratifying by TNM staging and tumor mutation burden (Figures S2A and 5(h)).

(a)

(b)

(c)

(d)

(e)

(f)
PCA analysis was carried out on the whole gene expression profile, 23 m6A-related genes, 2350 m6A-related lncRNAs, and 11 m6A-related lncRNAs to verify that our model was superior to other models and to assess its ability to distinguish patients with different risk levels. It was found that the degree of distinction between the two subgroups was higher in our model compared with the other three models, which enabled better differentiation between high and low-risk subgroups (Figures 6(a)–6(d)).

(a)

(b)

(c)

(d)
98 differential genes were screened by comparing the differences of genes between the high and low-risk subgroups in the whole dataset to identify the potential molecular mechanism of the m6A-based model. GO enrichment analysis indicated the biological processes were enriched in immunity (Figure 7(a)). Next, the immune enrichment results of immune cells, immune pathways, major histocompatibility complex molecules, chemokine receptors, and immune checkpoints indicated that the immune system of the high-risk subgroup was more active (Figure 7(b)). 20 driver genes with the most frequent alteration between the two subgroups were identified (Figures 7(c) and 7(d)). Furthermore, the TMB scores calculated from the TGCA dataset showed no significant differences () (Figure 7(e)).

(a)

(b)

(c)

(d)

(e)

(f)
The TIDE scores of all cases are based on the expression levels of immunotherapy biomarkers such as IFNG, MSI, Merck18, CD274, CD8, CTL, MDSC, CAF, and TAM-M20, suggesting that immune escape function in the high-risk group is stronger and a worse response to immunotherapy. This finding suggests that our model can classify patients by predicting their response to immunotherapy (Figure 7(f)).
Finally, 10 out of 78 compounds, with the most significant difference in drug half-maximal inhibitory concentration between high and low-risk groups, were screened to identify possible therapeutic drugs with our model (Figures S3A and 7(j)), which provides the basis for follow-up studies on therapeutic drugs for LUSC [29].
4. Discussion
Poor understanding of driver genes in LUSC accounts for the limited number of treatment strategies for this patient population [30, 31]. Hence, accurate prediction of the prognosis of the patients with LUSC is necessary, emphasizing the need to identify biomarkers for guiding treatment. It has been shown that m6A modifications and lncRNAs influence the occurrence and development of LUSC [32, 33].
In the present study, a risk model that works on predicting the prognosis in LUSC was established, and the relationship between our model and immune response was explored. An increasing body of evidence suggests that m6A-related lncRNAs are tightly related to antitumor immunity and immune infiltration [34, 35].
11 out of 2350 m6A-related lncRNAs that correlated with OS were screened. Among these, AP001189.3 has been reported to be related to MAPK and other signaling pathways and could reportedly predict the prognosis of colon cancer [36]. L3MBTL2 promotes the recruitment of the ubiquitin ligase RNF168 to DNA lesions and promotes the repair process. Meanwhile, L3MBTL2 can also serve as a key target of the ubiquitin ligase RNF8 after DNA damage [37]. However, L3MBTL2 has rarely been reported as a prognostic factor. Our research can provide a new direction for future generations to further study the function of this gene. Furthermore, GRHL3-AS1 was reported to have a prognostic function in primary head and neck squamous cell carcinoma [38]. Nonetheless, to the best of our knowledge, the predictive effect and biological function of the remaining 8 lncRNAs (AC008734.1, AL157838.1, AC010422.4, AP001347.1, AL731577.2, AC254562.3, DSCR9, and LINC02332) have not been reported in the literature. Indeed, the present study results provide the basis for future studies on the molecular biological function of these m6A-related lncRNAs in the occurrence and progression of LUSC.
The 495 cases extracted from TCGA were split into two groups based on the median risk score. Importantly, we found that the OS was lower in the high-risk group than in the low-risk one. Additionally, when stratified by gender, stage, TNM stage, and TMB, the OS in the high-risk group was still poorer, compared to the low-risk group. Therefore, our risk model consisting of 11 m6A-related lncRNAs correlated with OS yielded accurate results and provided the basis for subsequent research on potential biomarkers for LUSC treatment. Moreover, the analysis of GO enrichment indicated that the immune system in the high-risk group was more active, suggesting the potential relationship between our model and immune response. Moreover, this finding substantiated the association between the poor clinical outcome of the high-risk subgroup and the induction of highly expressed immune molecules. The TIDE score suggested that the high-risk subgroup reacted worse to immunotherapy because of a stronger immune escape function, compared to the low-risk subgroup, which could be attributed to a stronger immune escape ability. It is noteworthy that the expression of immune-related molecules in the low-risk subgroup was relatively lower. The above results indicate that immunotherapy can be effective, even in patients that express fewer immune checkpoint molecules such as PD-L1, compared with those with high expression of PD-L1, consistent with the literature [11, 31, 39]. Our results can also explain the phenomenon whereby some PD-L1 < 1% patients exhibit a better response to immunotherapy in contrast to PD-L1 > 1% patients [12]. The present study results are expected to offer novel insights into the biological function of m6A-related lncRNAs in LUSC. It has been shown that titin (TTN) is expressed in both groups, except TP53. Previous studies have shown that TTN can be alone considered a factor predicting the prognosis of LUSC and the efficacy of the treatment with immune checkpoint inhibitors.
Herein, univariate, LASSO, and multivariate Cox regression analyses were initially made use of selecting m6A-related lncRNAs with predictive function. Moreover, C-index, AUC, and KM analysis confirmed the powerful prediction ability of the risk model to predict prognosis.
However, this study also has certain limitations. Indeed, our model was established after multiple screenings, and the sample size was limited; more external experimental verification is needed to substantiate that m6A-related lncRNAs are efficient predictive biomarkers. What is more, the interaction between these prognostic lncRNAs and m6A regulatory factors in LUSC was not explored, and it remains unclear how m6A-lncRNAs can affect tumor immune response, warranting the need for further studies.
In conclusion, we established a model playing a prognostic role, made of 11 m6A-related lncRNAs from TCGA related to the tumor immune response, providing new directions for the prediction of patient prognosis in LUSC. Importantly, our model may help screen patients with good responses to immunotherapy and even clarify the biological processes of m6A-related lncRNAs in LUSC.
Data Availability
TCGA belongs to public databases. Users can download relevant data for free for research and publish relevant articles.
Ethical Approval
The patients involved in the database have obtained ethical approval. Our study is based on open-source data, so there are no ethical issues.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
H. M. and H. S. conceived the study and its design. Y. Z. and X. G. were involved in the data analyses and wrote the manuscript. S. W. and X. G. contributed to the discussion. H. M., H. S., and Y. Z. reviewed, edited, and finalized the manuscript. All authors contributed to the article and approved the submitted version.
Acknowledgments
The authors acknowledge the TCGA database for providing their platforms and contributors for uploading their meaningful datasets. This study was supported by grants from the National Natural Science Foundation of China (81872308 and 82072719) and the Natural Science Foundation of the Guangdong Province (2021A15150100790).
Supplementary Materials
Figure S1: Univariate ratio risk Cox regression screened out 21 m6A-related lncRNAs. Figure S2: Kaplan–Meier survival analysis based on TNM and TMB, between the high-risk and low-risk groups in the entire set. Figure S3: 10 candidate compounds targeting the m6A-related lncRNA prognosis model. (Supplementary Materials)