Abstract
Background. Despite the constant iteration of small-molecule inhibitors and immune checkpoint inhibitors, PRAD (prostate adenocarcinoma) patients with distant metastases and biochemical recurrence maintain a poor survival outcome along with an increasing morbidity in recent years. N7-Methylguanine, a new-found type of RNA modification, has demonstrated an essential role in tumor progression but has hardly been studied for its effect on prostate carcinoma. The current study aimed to seek m7G (N7-methylguanosine) related prognostic biomarkers and potential targets for PRAD treatment. Methods. 42 genes related to m7G were collected from former literatures and GSEA (Gene Set Enrichment Analysis) website. Then, RNA-seq (RNA sequencing) and clinical data from TCGA-PRAD (The Cancer Genome Atlas-Prostate) cohort were retrieved to screen the differentially expressed m7G genes to further construct a multivariate Cox prognostic model for PRAD. Next, GSE116918, a prostate cancer cohort acquired from GEO (Gene Expression Omnibus) database, was analyzed for the external validation group to assess the ability to predict BFFS (biochemical failure-free survival) of our m7G prognostic signature. Kaplan-Meier, ROC (receiver operator characteristic), AUC (areas under ROC curve), and calibration curves were adopted to display the performance of this prognostic signature. In addition, immune infiltration analysis was implemented to evaluate the effect of these m7G genes on immunoinfiltrating cells. Correlation with drug susceptibility of the m7G signature was also analyzed by matching drug information in CellMiner database. Results. The m7G-related prognostic signature, including three genes (EIF3D, EIF4A1, LARP1) illustrated superior prognostic ability for PRAD in both training and validation cohorts. The 5-year AUC were 0.768 for TCGA-PRAD and 0.608 for GSE116918. It can well distinguish patients into different risk groups of biochemical recurrence ( =1e-04 for TCGA-PRAD and =0.0186 for GSE116918). Immune infiltration analysis suggested potential regulation of m7G genes on neutrophils and dendritic cells in PRAD. Conclusions. A m7G-related prognostic signature was constructed and validated in the current study, giving new sights of m7G methylation in predicting the prognostic and improving the treatment of PRAD.
1. Introduction
Reportedly, the morbidity of PRAD in men has grown to the second highest, being the sixth leading cause of death for men [1]. About 93% PRAD patients were discovered in the local stages whose five-year OS (overall survival) rate near to 100% [2]. In contrast, the OS was less than 30% in patients accompanied with distant metastases [3]. At present, PSA (prostate-specific antigen) is the best first-step serum marker for the screening of prostate tumor, and it remains the most commonly used tumor marker [4]. Although PSA has achieved remarkable results in the early detection of prostate cancer, there is no consensus on whether PSA can effectively reduce the death risk for PRAD patients [5]. Currently, the treatment for PRAD, especially for the advanced PRAD, was still conventional radiotherapy and chemotherapy [6]. But for patients with advanced PRAD, the comprehensive efficacy was still not ideal. Therefore, what is urgently necessary is to find novel biomarkers for the diagnosis, prognosis, and treatment of PRAD to improve the survival outcome of patients in terminal stage.
It is known that m7G, one type of RNA modification, is significantly associated with various biological processes [7] and is widely found in eukaryotes and prokaryotes [8]. It is widely distributed in the 5 cap region of tRNA [9], rRNA [10], and eukaryotic mRNA [11] to modulate RNA processing [12, 13], elongation [14], splicing [15], nucleation, and protein translation. M7G has been reported related with the occurrence of primordial dwarfism in human [8]. The level of m7G was significantly reduced in the primordial dwarfism patients with a missense mutation in gene encoding WDR4 [16]. Additionally, the m7G level was also related with tumor cell chemoresistance [16], cell cycle, and oncogenicity [17]. Decreased m7G tRNA modification could induce oncogenicity of lung cancer by decreasing cell proliferation, colony formation, and cell invasion [18]. Although m7G has had certain research results, the roles of the m7G-related genes and corresponding RNA modification process in PRAD remain unknown. In addition, it is unclear for the prognostic power of m7G-related genes in PRAD. The current study aimed to seek m7G-related prognostic biomarkers for terminal-stage PRAD patients, providing potential drug targets for advanced prostate cancer patients’ treatment.
2. Material and Methods
2.1. Differentially Expressed N7-Methyladenosine-Related Genes
The RNA-seq data of 551 samples, including 52 normal samples and 499 PRAD samples, was downloaded in TCGA database [19] in the form of row counts. The clinical data, such as age, sex, tumor grade, Gleason score, and PSA, was also acquired from TCGA. Similar information about the external validation cohort (GSE116918) was acquired from GEO database. The m7G-related gene sets, “GOMF M7G 5 PPPN DIPHOSPHATASE ACTIVITY,” “GOMF RNA 7 METHYLGUANOSINE CAP BINDING,” and “GOMF RNA CAP BINDING” contain 26 genes which were related with m7G, were downloaded from the GSEA [20]. Even more, according to previous reports, we finally found other 16 m7G-related genes which were illustrated in supplementary materials (available here). The “limma” [21], a kind of R packages, was used to recognize m7G-related differentially expressed genes (mDEGs), with . To explore the potential regulating relationship, STRING (Search Tool for the Retrieval of Interacting Genes) [22] was employed to seek possible regulating relationship among these mDEGs in version 11.5.
2.2. Analysis and Validation of Our Prognostic Signature
These mDEGs were subsequently subjected to estimate their prognostic power by univariate Cox regression with . Then, lasso (least absolute shrinkage and selection operator) [23] Cox regression analysis was adopted for this prognostic signature construction by using R package “glmnet.” Finally, three mDEGs were included in the prognostic signature after lasso penalty. The proportional hazard was counted as the formula: risk score = (: coefficients, : gene expression). The forest plot was drawn by “survminer” package to show hazard ratio and confidence interval of genes. Patients were stratified into low- and high-risk clusters based on optimal cut-off; their survival difference was displayed in KM (Kaplan-Meier) curves. PCA (principal component analysis) was conducted to demonstrate the distance of subgroups by using R package stats. R packages, including “survminer” and “timeROC,” were also employed to compute ROC curves of 1, 3, 5 years for the prognostic signature. Same analytic strategy was also conducted in the external validation cohort GSE116918.
2.3. The Prognostic Value of This m7G-Related Signature
Univariate and multivariate Cox regression analyses were employed to assess prognostic value for this signature. HR of this signature and traditional clinical factors for prognostic, including Gleason score and tumor grade, were displayed in the forest map and heat map. Same strategy was also carried out in the external validation cohort GSE116918. According to multivariate Cox regression analysis’ results, a nomogram combing the m7G-related signature and clinical factors was depicted to provide easy access to patients’ prognostic risk.
The calibration curve was also demonstrated to verify the integrity of the above nomogram.
2.4. The Expression Validation of 3 Genes in m7G Signature
To understand the expression of these genes in PRAD patients with different stages, the clinical correlation analysis was utilized between PRAD clinical characteristics and the level of risk score, as well as expression level of these genes. In addition, immunohistochemical images of EIF4A1, EIF3D, and LARP1 in normal tissues and PRAD tissues for this study were downloaded from the Human Protein Atlas [24], which were used for further understanding their expression situation in proteins.
2.5. Downstream Regulatory Pathways and Infiltrating Immune Cells of the m7G-Related Signature
To determine the downstream regulatory pathway of genes in the m7G-related signature, differential analysis was accomplished between different risk subsets where DEGs were selected when its and .
Then, KEGG (Kyoto Encyclopedia of Genes and Genomes) and GO (Gene Ontology) analyses were conducted with R package “clusterProfiler” [25] for these DEGs. GSVA (Gene Set Variation Analysis) [26] was utilized to analyze the relationship of some immune cells and immune pathways by R package “GSVA.” Furthermore, correlation analysis was performed between immune cells and immune pathways to computing their potential regulating relationship.
2.6. Drug Sensitivity Analysis for the 3 Genes in m7G Signature
In the data of the US NCI (National Cancer Institute) 60, a total of 60 cancer cell lines have been derived from nine different cancers [27], which was acquired from CellMiner database. Pearson’s correlation analysis was utilized to assess the relationship between drug response and gene expression levels related to m7G.
2.7. Statistic and Software
The data was processed and analyzed using R 4.41 (package: limma, ggplot2, survminer, timeROC, GSVA, and so on). Statistical correlations of parametric and nonparametric variables were analyzed using the Pearson and Spearman correlations. We considered all data analyses to be statistically significant with .
3. Results
3.1. Differential Analysis between Normal and Tumor Prostate Tissues
Differential analysis of the 42 m7G-related genes was conducted between PRAD tissues and normal prostate tissues in TCGA, and 16 DEGs were identified with . As illustrated in Figure 1(a), a heat map showed the expression levels for these mRNAs related to m7G. To discover the possible interactions between m7G-related genes, protein-protein interactions analysis was conducted (Figure 1(b)). For the interaction analysis, a minimum interaction score of 0.9 was required, and the result shows that NCBP1, NCBP2, AGO2, EIF4A1, EIF43D, EIF4G3, and EIF4E2 were hub genes. The correlation network which contained 42 m7G-related genes is demonstrated in Figure 1(c).

(a)

(b)

(c)
3.2. A m7G-Related Gene Prognostic Model in TCGA
After removing the sample with missing clinical data, the 339 samples were used for further analysis. Firstly, the primary screening of genes associated with BCR (biochemical recurrence) was estimated by univariate Cox regression analysis. To keep prognosis value of our prognostic signature, the cut-off -value for the study was set at 0.05, and 3 genes (EIF3D, EIF4A1, LARP1) were obtained (Figure 2(a)). Then, the lasso Cox regression analysis was carried out for further screening, then a 3-gene signature was acquired with the ideal value (Figures 2(b) and 2(c)). The multivariable Cox regression was employed in the 3 genes, and the result shows that EIF3D (), EIF4A1 (), LARP1 () could be well prognostic genes (Figure 2(d)). Based on the following calculation, risk score was computed: . In accordance with the optimum standard, 339 patients were separated: 226 patients with low risk and 113 patients with high risk. Then, the result of PCA illustrated that PRAD patients from TCGA with varying risk scores were well separated into two different groups (Figure 2(e)). A higher number of BCR samples were collected from patients with higher risk scores; meanwhile, a shorter BFFS time was recorded than that with lower risk scores (Figures 2(f) and 2(g)). With , significant differences were found between the BFFS times of the two groups (Figure 2(h)). The sensitivity and specificity of this m7G gene prognostic signature were evaluated by ROC analysis. The results of ROC illustrated that AUC was about 0.768 for 1 year, 0.666 for 3 years, and 0.681 for 5 years BCR (Figure 2(i)).

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)
3.3. External Validation of This Prognostic Signature
The data of extra RNA-seq and clinical characteristics of 248 PRAD patients were downloaded in a GEO cohort (GSE116918) for external validation. The multivariable Cox regression of 3 genes was also conducted in GEO cohort, and the result shows EIF3D (), EIF4A1 (), LARP1 () (Figure 3(a)). The GEO cohort was reclassified as either low or high risk based on the sores in the TCGA cohort, with 165 patients classified as low risk, and the other 83 patients as high risk. The PCA analysis illustrated significant separation of the two different risk groups (Figures 3(b) and 3(c)). Patients with lower risk scores were found to have longer BFFS time and lower BCR rate (Figure 3(d)). According to the external validation model, BFFS time also significantly associated with risk score with (Figure 3(e)). In GEO cohort, ROC curve indicated that our prognostic signature had significant prognostic value with AUC of 0.920 for 1 year, 0.588 for 3 years, and 0.610 for 5 years depending on the BCR (Figure 3(f)).

(a)

(b)

(c)

(d)

(e)

(f)
3.4. Evaluation of Prognostic Value of m7G-Related Gene Signature
Table 1 summarizes clinical features of PRAD patients (339 patients from TCGA and 248 from CEO cohorts). Univariate and multivariate Cox regression analyses were employed in order to assess whether this prognostic signature could well predict the BFFS independently. As determined by outcomes of univariate Cox regression, the signature could be a predictor of survival of PRAD patients in TCGA () and GEO () database (Figures 4(a) and 4(c)). Using the multivariable Cox regression analysis, we found this risk signature could be a well independent predictor of BCR for PRAD patients, independently of age, gender, and other confounding factors in TCGA (, Figure 4(b)) and GEO (, Figure 4(d)) cohorts. Based on clusters divided by the risk scores, we created a heat map to illustrate the differences in the distribution of clinical characteristics between different risk groups’ patients in TCGA (Figure 4(e)). Figure 5(a) shows the situation that the prognostic nomogram for BFFS from the TCGA cohort included all significant clinical findings. In addition, the calibration curve for the probability of recurrence of biochemical abnormalities in PRAD patients at 1, 3, or 5 years showed optimal agreement with predictions derived by the nomogram (Figure 5(b)). On the blue curve, the prediction model accuracy for survival over one year was represented. Likewise, the violet curve represented survival over three years and the red curve represented survival over five years.

(a)

(b)

(c)

(d)

(e)

(a)

(b)
3.5. Expression Validation of the Risk Model in Prostate Cancer Patient
The correlation analysis was employed between the clinical characteristics and the expression of EIF3D, EIF4A1, and LARP1 in TCGA cohort, the risk score as well. As shown in Figures 6(a)–6(c), T classification and lymph node involvement were significantly associated with EIF3D expression; a significant association was also identified between LRP1 expression and T classification, lymph node involvement, and Gleason score (). Furthermore, risk scores were also notably correlated to T-staging in pathology, N-staging in pathology, and Gleason score for progression of PRAD (Figures 6(d)–6(f)). In accordance with the Human Protein Atlas database, we discovered that EIF3D, EIF4A1, and LARP1 were all upregulated in PRAD tissues (Figure 6(g)).

(a)

(b)

(c)

(d)

(e)

(f)

(g)
3.6. Functional Enrichment and Immune Activity Analysis
Differential analysis was conducted by “limma” R package to extract genes which were differentially expressed between different risk groups to explore signaling functions and pathways. 2219 DEGs were identified in TCGA cohort under the criteria: , as well as , including 19 downregulated genes and 2200 upregulated genes in the high-risk cluster. Through these DEGs, enrichment analysis of KEGG and GO indicated that these genes were closely correlated with humoral immune response regulated by circulating immunoglobulin, immunoglobulin complex, immunoglobulin receptor binding, and neuroactive ligand-receptor interaction (Figures 7(a) and 7(b)).

(a)

(b)
ssGSEA analysis demonstrated a correlation between TCGA and GEO for 16 immune cell infiltration values and 13 immune-related pathway activities. Results from TCGA cohort analysis illustrated that patients with higher risk scores had less immune cell infiltration compared with low-risk participants, such as aDCs, neutrophils, and Th1 cells (Figure 8(a)). In immune-related pathways activity analysis, MHC (major histocompatibility complex) class I, parainflammation, and type I IFN (interferon) response showed the lower activity in high expression group (Figure 8(b)). Additionally, in GEO cohort, we found that DCs (dendritic cells), mast cells, pDCs (plasmacytoid DCs), and Th (T helper) cells had significant difference in subgroups (Figure 8(c)). Immune-related pathways, such as APC (antigen-presenting cell) co-stimulation, check point, and T cell co-stimulation, also had significant differences (Figure 8(d)). Then, the correlation of infiltrating immune subsets between different immune cells and pathways related to immune is demonstrated in Figures 9(a) and 9(b). The closer the correlation coefficient was to 1, the higher the correlation between immune cells or immune-related pathways. A heat map illustrated that T helper cells, HLA (human leukocyte antigen), and parainflammation were significantly associated with PRAD samples.

(a)

(b)

(c)

(d)

(a)

(b)

(c)
3.7. Expression of Genes with Sensitivity of Cancer Cells to Chemotherapy
Pearson’s correlation analysis was conducted to assess whether 3 genes within NCI-60 cell lines have a relationship with drug sensitivity and there was a -value < 0.01. The EIF3D, EIF4A1, and LARP1 genes have all been implicated in chemotherapeutic drug sensitivity, such as hydroxyurea, chelerythrine, methylprednisolone, cladribine, and cytarabine (Figure 10). EIF3D was identified to enhance the sensitivity of cancer cells to some chemotherapeutics, such as hydroxyurea, chelerythrine, and vorinostat, while LARP1 weakened the sensitivity of denileukin diftitox ontak.

4. Discussion
In our study, the differentially expressed m7G-related genes were analyzed between normal prostate tissues and prostate cancer tissues in TCGA, and 16 m7G-related genes were identified as mDEGs. Univariate and lasso Cox regression analyses were then utilized to develop this 3-gene risk signature that assessed the prognostic relevance of genes which were significantly connected with m7G. Taking outcomes of our study into account, we believe that this signature could be regarded as a prognostic factor to predict the BFFS for PRAD patients independently. Then, the function enrichment analysis presents that this prognostic signature was observably related with immune cells and immune-related pathways. Additionally, analysis of drug sensitivity indicated that EIF3D, EIF4A1, and LARP1 expressed by cancer cells greatly influenced the sensitivity of the cells to chemotherapy.
The mRNA cap m7G is reported as a ubiquitous positively charged modification [7] and can modulate nearly all stages of mRNA synthesis. However, how m7G-related genes related with the occurrence and progression of PRAD, as well as the survival time of PRAD patients, was still uncertain. Based on 3 genes associated with the m7G sequence, this study developed a signature, including EIF3D, EIF4A1, and LARP1. Later, we found the signature could predict the BFFS of PRAD patients. EIF3 is the largest of the ribosomal EIF complexes that bind to the 40S ribosome and maintains dissociation of the 40S and 60S ribosomal subunits. According to some previous studies, EIF3D, the largest subunit of EIF3, regulates the stability of the association between EIF3 subunits [28]. EIF3D has also been reported to play some important role in tumors, such as colon cancer [29], melanoma [30] and breast cancer [31]. Recent studies concluded that the EIF3D expression is critical to the occurrence of cancer by promoting protein synthesis. Overexpression and activity of EIF4A1, which is an ATP-dependent RNA helicase, have been linked to the development of certain tumors [32]. In ATP-dependent protein translation, EIF4A1 catalyzes unwinding of mRNA’s 5 UTR, an essential step in RNA processing, prior to ribosome binding, especially for mRNAs with increased lengths and complexity of the 5 UTR. Previous studies over the past decade had proved the antitumor toxicity of EIF4A1 inhibition, maintaining a therapeutic window for normal cells [32, 33]. LARP1 had also been reported to have significant correlation with some tumor, such as colorectal cancer [34] and ovarian cancer [35]. What has been illustrated is that LRP1 complexes with PABP (poly-A binding protein) and binds an interactome of over 3000 mRNAs, among them transcripts encoding the ribosomal machinery, which lack the 5 TOP motif [36]. Furthermore, LARP1 had been shown to contribute significantly to ribosome biogenesis through its involvement in mTOR signaling [37]. A m7G group or derivative at the 5 end of an RNA molecule showed selective and noncovalent interactions with EIF3D, EIF4A1, and LARP1, according to the gene sets “GOMF RNA CAP BINDING.” In summary, EIF3D, EIF4A1, and LARP1 in the prognostic model were proved to be m7G-related genes. Our study showed that not every gene we examined was related to a better PRAD prognosis. Thus, the specific mechanism of these genes interacting with each other during m7G remains to be further explored.
Up to now, the research on mechanism of m7G-related genes in PRAD is still insufficient. Our study identified that EIF3D, EIF4A1, and LARP1 may be key m7G-related genes in PRAD. In our analyses of prognostic value of these genes, we identified a signature that may function as a prognostic factor independently. Moreover, immunoactivity analysis and function enrichment were also employed to investigate the possible functions of this signature. However, the specific mechanism of m7G-related genes, including EIF3D, EIF4A1, and LARP1, affecting the occurrence, development, and prognosis of PRAD deserves further research.
In conclusion, the study presented that m7G was closely related with PRAD, because for the expression of most of the genes related to m7G, there was a great difference between tumor and normal prostate tissues. Aside from that, the score calculated from the 3 m7G-related genes signature had significant correlation with the prognosis of PRAD patient. Also, the model may be a method for predicting BFFS independently in TCGA cohort and GEO cohort for PRAD patients. On the other hand, immunity was associated with DEGs which was identified from different risk clusters. In our study, a novel prognostic signature was identified to predict the BFFS for PRAD patients, and the genes EIF3D, EIF4A1, and LARP1, related to m7G, were identified as the genes representing m7G-related genes in PRAD for further mechanisms research. In addition, the study provides a well condition to investigate the regulating relationship between m7G and immunity in PRAD.
5. Conclusion
In our study, multiple bioinformatics analyses were utilized to identify a 3-gene signature related with m7G to evaluate the prognosis of PRAD patient. Our study found that the signature was significantly related with the prostate cancer clinical characteristics and immunity. Thus, the gene signature in our study could be utilized as an independent prognostic indicator for PRAD patients.
Abbreviations
| m7G: | N7-methylguanine | 
| PRAD: | Prostate adenocarcinoma | 
| BCR: | Biochemical recurrence | 
| BFFS: | Biochemical failure-free survival | 
| RNA-seq: | RNA sequencing | 
| mDEGs: | m7G-related differentially expressed genes | 
| GO: | Gene Ontology | 
| KEGG: | Kyoto Encyclopedia of Genes and Genomes | 
| TCGA: | The Cancer Genome Atlas | 
| GEO: | Gene Expression Omnibus | 
| GSEA: | Gene Set Enrichment Analysis | 
| GSVA: | Gene Set Variation Analysis | 
| KM: | Kaplan-Meier | 
| PCA: | Principal component analysis | 
| ROC: | Receiver operating characteristic | 
| NCI: | US National Cancer Institute. | 
Data Availability
The data could be acquired from TCGA dataset (https://portal.gdc.cancer.gov/) and GEO dataset (https://www.ncbi.nlm.nih.gov/geo/). The data of drug sensitivity could be acquired from the CellMiner database (https://discover.nci.nih.gov/cellminer). The code used in the study is available from the corresponding author on reasonable request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
Wanggli Mei and Xuyang Jia designed experiments, collected samples, and conducted specific experimental operations. Statistical analysis, data collation, and interpretation are carried out by Liang Jin, Shiyong Xin, and Xianchao Sun. Bihui Zhang, Jiaxin Zhang, and Xiang Liu collected and compiled the data. Lin Ye, Guosheng Yang, and Ping Chen provided technical guidance and funding for experimental projects. Wangli Mei and Xuyang Jia contributed equally. Wangli Mei and Xuyang Jia are co-first author.
Acknowledgments
Thanks are due to TCGA, GEO, and CellMiner staff for the data. This work was supported by grants from the National Natural Science Foundation of China (Nos.81972409, 81672549) and Health System Independent Innovation Science Foundation of Shanghai Putuo District (No.PTKWWS201819).
Supplementary Materials
Supplementary Table 1: These genes were m7G-related genes including 26 genes in (GSEA) database ((http://www.gsea-msigdb.org/gsea/index.jsp)), and 16 other genes from research previously published, which were utilized to construct the prognostic signature in PRAD. Supplementary Table 2: The results of differential expression analysis of 42 m7G-related genes, and 16 DEGs were identified with . Supplementary Table 3: Differential expression genes between high- and low-risk groups, which were selected for functional enrichment analysis and immunocorrelation analysis. (Supplementary Materials)