Abstract

Background. Diffuse large B-cell lymphoma (DLBCL) is one of the largest lymphoma subcategories. Usually, 50%–70% of DLBCL patients can be cured by the standard treatment. But, at least one third have bad prognosis. Based on this situation, the research on DLBCL therapy strategy is still indispensable. Methods. A prognostic signature was built according to the public data and bioinformatics methods, the stability and reliability was assessed and validated. GSEA was performed to explore the difference in different groups. Consensus clustering and immune infiltration analysis were conducted comprehensively. Results. In this work, a signature based on multiple metabolism-associated genes (MTGs) was established, containing 16 MTGs, to predict the prognosis of DLBCL patients. The accuracy and effectiveness of this signature have been verified by three external validation sets. According to the risk formula, DLBCL patients were divided into high- and low-risk groups, and the survival rate of the low-risk group was significantly higher than that of the high-risk group. Furthermore, gene set enrichment analysis (GSEA) demonstrated that beta-alanine metabolism and regulation of actin cytoskeleton signal pathways were enriched in the low-risk group. The actual survival and nomogram-predicted survival matched well both in the training cohort and verification cohorts. Conclusion. In general, our prognostic signature can provide reliable and valuable information for medical workers in predicting the prognosis of DLBCL. A preprint was made available by the research square in the following link: “https://www.researchsquare.com/article/rs-1468741/v2.”

1. Introduction

As one of the largest lymphoma subcategories, the diffuse large B-cell lymphoma (DLBCL) with heterogeneous in clinical manifestations and prognosis is a malignant tumor originating from B lymphocytes [13]. Shipp classified two categories’ patients, critical serine/threonine phosphorylation pathways and apoptosis, through supervised learning and some genes responses to B-cell-receptor signaling [4]. The research of Dr. Staudt reviewed that DLBCL could be identified two clusters (germinal center B-cell-like (GCB-1ike) and activated B-cell-like (ABC-like)) and GCB-like expressed genes characteristic of germinal center B cells, whereas ABC-like expressed the characteristic of plasma cells [5, 6]. In the latest framework, ABC cluster is split into MCD, BN2, N1, and A53 four genetic subtypes, and GCB cluster is divided into EZB-MYC+, EZB-MYC–, ST2, and BN2 [79]. However, there are still 10%–20% of DLBCL cases that cannot be classified into these two types mentioned above [6, 10, 11], the new research showed that the unclassified DLBCL was enriched in BN2 subtypes [79].

In the past 20 years, the results of many phase III clinical trials have established the regimen of rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP) as the standard treatment for DLBCL patients, usually 50%–70% of which are cured by this treatment method [1214]. However, the vast majority of the remaining patients is either intractable to this treatment, or relapse after remission [10]. Therefore, the research on DLBCL is absolutely essential.

A couple of prognosis models had been built, based on immune-related genes (IRGs), epigenetic modifications, lncRNA, or miRNA [1519]. More than two signatures were built based on IRGs, whereas the prognostic signatures were inconsistent, data sources and analytical methods may account for the inconsistent results, but the complexity of the pathogenic mechanism may be the main reason. It is essential to establish a prognosis signature from multiangle.

The attention of the researchers has been attracted by the intricate connection between metabolic disorders and tumorigenesis. Metabolic pathways are evolutionarily conserved in cells. So, metabolic reprograming has been regarded as a core feature of cancer cells [2023]. So far, several specific metabolic pathways have been found to be directly involved in the tumor metastasis and development [24]. The phosphatidylinositol-3-kinase (PI3K), protein kinase B (PKB/AKT), and mammalian target of rapamycin (mTOR) pathways play key roles in the proliferation and cell death of DLBCL [25, 26]. PI3Kδ signaling is overactive in many B-cell malignancies and has been shown to drive proliferation and transport to lymphatic tissues [26]. Glutamine metabolism can regulate various pathways in brain tumor cells, including macromolecule synthesis, energy production, epigenetic regulation, and redox homeostasis [27]. Inhibition of glutamine metabolism in vivo can interrupt the blood supply for tumors and may the increase treatment response [28, 29]. Therefore, inhibiting metabolic pathways or restoring those altered pathways is a promising therapeutic target strategy [30].

A single metabolic marker is not enough to reveal the complex metabolic environment [16]. So, it is urgently needed to find a feature based on multiple metabolism-associated genes (MTGs) to provide more reliable information for medical worker to effectively predict characteristics of a tumor microenvironment and prognosis of DLBCL patients. In this research, we combined the clinical information and expression profile of MTGs to assess overall survival (OS) for DLBCL patients. The prognostic model of MTGs was conducted, and its prognostic value was validated by the gene expression omnibus (GEO) databases (GSE32918, GSE10846, and GSE53786). This work could provide valuable information for clinicians in the process of DLBCL patients diagnosis, prognosis, and treatment guidance. A preprint has previously been published [31].

2. Materials and Methods

2.1. Collection of MTGs and Datasets

MTGs were collected from the GeneCards online database (https://www.genecards.org), which provided universal human genetic information [32, 33]. “Metabolism” was selected as the search keyword from main menu, the noncoding RNA or uncategorized gene Ids were deleted, then the left genes with relevance scores >8 were regarded as MTGs.

We adopted a GEO database (https://www.ncbi.nlm.nih.gov/geo/) to complete the following tasks: obtain the gene-expression profile matrixes of DLBCL from GSE11318 (203 tumor samples) and GSE8762 (10 common samples) for differential expression analysis (DEA), and the data from GSE11318 for training, whereas data from GSE32918, GSE53786, and GSE10846 for validation [3441]. Log2 transformation and canonicalization were performed on the expression profile. Packages, affyPLM, and affy were used to standardize the gene expression matrix [42]. R software 4.1.0 was used in this work. The information of GEO cohorts was summarized in Table 1 and the clinicopathological characteristics of each series had been displayed in Supplementary 1. More detail information was included in GEO database.

All data comes from the public databases; no ethics committee approval is required.

2.2. Differential Expression Analysis

DEA of those MTGs mentioned above in GEO cohorts was performed using the limma and gplots package [43]. Identifying statistical significance of differential expression genes (DEGs) was considered, while, the genes that the statistical significance reaches adjusted p-value (adj. p) <0.05 and |fold change (FC)| >0.5 were determined as the candidate DEG. At the same time, we also downloaded the DEGs ID of DLBCL patients from the gene expression profiling interactive analysis (GEPIA: http://gepia.cancer-pku.cn/index.html) [39]. Finally, the union of differentially expressed (DE)-MTGs obtained in the above two ways were taken for subsequent analysis.

2.3. Constructing a 16-MTGs Prognostic Signature

Univariate Cox proportional hazard regression analysis was performed for each DE-MTG to screen genes that were significantly related to the overall survival (OS) in GSE32918 and GSE11318, and then the common survival-related genes in the two data sets were taken. The least absolute shrinkage and selection operator (LASSO) Cox regression analysis was performed for those screened genes in GSE11318, then a multivariable prognostic signature with MTGs was founded by multivariate Cox proportional hazards regression analysis. We estimated the risk score of those genes with nonzero coefficients and generated a prognostic risk score for each patient according to the following formula: risk score = expression level of gene1 × j1 + expression level of gene2 × j2 + … + expression level of genex × jx, where j represents the coefficient. The GEO DLBCL patients from GSE11318 were divided into high- and low-risk groups with the median risk score as the cutoff value. The same formula was used to calculate the risk value of the validation sets. Because the data comes from different sequencing platforms and the standardization methods are different, respective median values were taken as cutoff values for the training and validation datasets.

Univariate and multivariate Cox proportional hazards regression analyses were performed to test whether the prognostic model based on MTGs was an independent prognostic factor. The Kaplan–Meier survival curve (K–M curve) was constructed, and the log-rank test was performed to evaluate the survival differences between high- and low-risk groups. Receiver operating characteristic (ROC) curve analysis was used to evaluate the sensitivity and specificity of the prognostic model. Calibration plots were applied to detect the differentiation between the actual and predicted OS probability. Decision curve analysis (DCA) was used to assess the utility of metabolism-associated prognostic signature and compare with the immune model established by Feng et al. [16].

2.4. Functional Enrichment Analysis and the Identification of Potential Drugs

Gene functional analysis is a considerable step to transform the molecular discovery of high-throughput methods into the biological significance. GSEA evaluates the expression profile of the whole genome at the level of gene sets, instead of focusing on only a few genes that are mostly changed [44]. To identify the relevant pathways and molecular mechanisms, GSEA was conducted between the gene expression data of high- and low-risk groups in the GSE11318 cohort. After 1,000 permutations, gene sets with and false discovery rate (FDR) less than 0.25 were considered significantly enriched. The R package “ClusterProfiler” and “org.Hs.eg.db” were used for statistical analysis and visualizing the functional profiles of the DE-MTGs, including gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Adj. p-value <0.05 was taken as the critical value of significance. GDSC (https://www.cancerrxgene.org/) was used to screen the potential drugs for DLBCL patients [45].

2.5. Cluster and PCA Analysis

The expression data of MTGs from GSE11318 were grouped by the “ConsensusClusterPlus” package, and principal component analysis (PCA) was conducted to verify the cluster results [46]. PCA is a multivariate data analysis tool that can be used to reorganize the variables of a large multivariate data set, and reconstruct the first few variables accounting for most of the data differences [47]. Lastly, we used “survival” package to estimate the survival difference of subgroups.

2.6. Correlation Analysis between MTGs and Immune-Related Genes (IRGs)

IRGs were retrieved from the analysis Portal (ImmPort) (https://immport.niaid.nih.gov) and immunology database [48]. We conducted a correlation analysis between MTGs and IRGs using the “Hmisc” package. Genes, satisfying the correlation coefficient R > 3 and , were selected for this analysis. Finally, we visualized the values of R using pandas, seaborn, and matplotlib in python.

2.7. Immune Infiltration Analysis

We used CIBERSORT from Sangerbox (http://sangerbox.com/home.html) to detect the abundance of tumor-infiltrating immune cells. Pearson’s correlation coefficient and the estimated p value were utilized to evaluate the relationship between IRGs and immune infiltrating cells.

3. Results

3.1. Identification of DE-MTGs

Samples with survival time less than 50 days were kicked out to make model more reliable and reduce errors. Total of, 198 DLBCL patients with complete and qualified survival statistics from GSE11318 were used as training group. Total of, 1,937 MTGs were screened from the Genecards database. Total of, 750 MTGs differentially expressed in GEO cohort were shown in heatmap figure, and 749 of which were upregulated and 1 was downregulated (Supplementary 2). Total of, 982 DE-MTGs were obtained from GEPIA. The union of DE-MTGs (contain GEO and GEPIA cohorts) are shown by the Venn diagram. We conducted this research following the flowchart (Figure 1).

3.2. Construction of the 16-MTGs Prognostic Signature

There were 616 common genes among GEPIA, MTG, and differential expression MTG. Univariate Cox proportional hazard regression filtered out 28 MTGs significantly relating to OS from the 616 DE-MTGs (Figure 2). LASSO Cox regression analysis was performed to construct the prognostic signature and 16 MTGs were obtained. The LASSO coefficient profiles are shown in Figure 3(a). When 16 MTGs (ABCB7, CBX5, CHDH, FBXW7, GYG1, IFIH1, MPI, MYC, NOTCH3, POLG, PPAT, PRPS2, RARRES2, RB1, TREM2, and UMPS) were included, the prognostic signature performed best (Figure 3(b)). The prognostic risk score of each patient was calculated based on an mRNA expression level of these 16 MTGs, and patients were divided into high- and low-risk groups, according to the median risk score as the cutoff value (Figure 3(c)), and as the risk score increased, so did the number of patient deaths (Figure 3(d)). The detailed information of regression coefficients and the risk formula were shown successively in Table 2 and as follows.

The applicability of the prognostic model was tested via integrating the results of K–M curve, ROC curve, and decision curve. The time-dependent ROC curves of the prognostic genes signature in GSE11318 training cohort according to single gene was shown in Figure 4(a). The results of K–M analysis had revealed that the high expression of CHDH, GYG1, MPI, POLG, and PPAT increased the risk of death, while IFIH1, NOTCH3, RARRES2, and TREM2 have diametrically opposite effects (Supplementary 2). The K–M curve also suggested that the survival probability of the low-risk group was higher than that of the high-risk group (; Figure 4(d)). The actual survival and nomogram-predicted survival matches well, as shown by the calibration curves (Figure 4(b)) and decision curves. The prognostic signature established in this study has a higher net benefit than the default strategies and outperforms the immune gene-related model established by Feng et al. [16] across the relevant threshold range (Figure 4(c)).

3.3. Verification of the 16-MTGs Prognostic Signature in GEO Cohort

The GSE32918, GSE10846, and GSE53786 cohorts were used to verify the prognostic value of this model. The verification results were the same as expected, the low-risk group has a higher probability of survival and longer survival time than the high-risk group, the high-prognostic value of the signature was also observed as ROC curves showed (area under curve, AUC = 0.792 at 3 years and AUC = 0.737 at 5 years in GSE32918; AUC = 0.639 at 5 years and AUC = 0.679 at 10 years in GSE10846; AUC = 0.639 at 5 years and AUC = 0.657 at 10 years in GSE53786) and the actual survival and nomogram-predicted survival in which matched well. AUC was defined as the area under the ROC curve enclosed by the axes, which is used to evaluate the accuracy of the model. In general, 16-MTGs prognostic signature is valuable for the prognostic prediction of DLBCL patients. Similarly, through DCA, for the three validation sets, the 16-MTGs prognostic signature is better than the prognostic model of IRGs previously reported (Supplementary 2).

3.4. Exploration of Related Signal Pathways and Potential Drugs

We detected that beta-alanine metabolism and regulation of actin cytoskeleton signal pathway enriched in the low-risk group from the GEO cohort by GSEA (Supplementary 2). At the same time, the GO function enrichment and KEGG pathway enrichment analysis were also performed. The results revealed that 29 GO terms and 7 KEGG pathways were significantly enriched. The GO function enrichment showed that the MTGs were mainly associated with coenzyme binding, lyase activity, transferase activity, transferring alkyl or aryl (other than methyl) groups, modified amino acid binding, electron transfer activity, iron–sulfur cluster binding, metal cluster binding, flavin adenine dinucleotide binding, oxidoreductase activity, acting on the CH–CH group of donors, NAD or NADP as acceptor, glutathione transferase activity, nuclear receptor activity, transcription factor activity, direct ligand regulated sequence-specific DNA binding, promoter-specific chromatin binding, glutathione binding, and oligopeptide binding in molecular function (Figure 5(a)). The result of KEGG pathways enrichment analysis demonstrated that MTGs were mainly enriched in pathways of biosynthesis of cofactors, drug metabolism-other enzymes, biosynthesis of amino acids, valine, leucine, and isoleucine degradation, cysteine and methionine metabolism, N-Glycan biosynthesis, and glutathione metabolism (Figure 5(b)).

There were 162 compounds found for DLBCL based on differentially expressed genes between healthy and DLBCL patients. Among 398 DEG, between high- and low-level patients, were used to searching potential drugs, only 6 were identified as target genes and 17 potential drugs were found. DNA replication, mitosis, EGFR signaling, apoptosis regulation, chromatin histone acetylation, ABL signaling, ERK MAPK signaling, PI3K/MTOR signaling, genome integrity, kinases, protein stability and degradation, RTK signaling, cell cycle, and WNT signaling may be the major target pathways. The detailed information of target genes, target pathways, and genome feature of patients were summarized in Supplementary 1 and Supplementary 2.

3.5. Identification of Two Clusters for DLBCL Patients

All MTGs from GSE11318 were used for consensus clustering analysis. The result illustrated that decreasing delta data from 2 to 9 of k, k = 2 was selected as an optimal cluster based on consensus CDF, and two different subgroups were identified (Figure 6(a)6(d)). To validate the result of the cluster, we conducted PCA analysis and found that the expression of prognosis-related MTGs showed obvious group segregation (Figure 6(e)). Furthermore, the K–M survival analysis of cluster samples illustrated that the OS of Group 2 was significantly lower than that of Group 1 (Figure 6(f)).

3.6. Correlation between MTGs and Survival-Related Immune Genes at Transcription Levels in DLBCL Patients

To better present the relationship between MTGs from GSE11318 and survival-related immune genes, correlation analysis was performed. As the correlation heatmap was shown in Figure 7, most MTGs were correlated with survival-related immune genes, especially CBX5, MPI, and PPAT. Besides, CBX5 and MPI were positively correlated with these genes mostly, while PPAT is negatively correlated with them. On the contrary, CHDH and POLG have no significant correlation with survival-related immune genes (Supplementary 2).

3.7. Immune Infiltration

To explore the differences between high- and low-risk groups on the cellular level and the genome level, 24 types of infiltrating immune cell scores were compared, and the same as immune gene-sets. The results showed big differences between high- and low-risk groups in terms of immune cell infiltration and immune gene-set activity. We found that the expression levels of naïve B cells, memory B cells, plasma cells, monocytes, M0 macrophages, M1 macrophages, M2 macrophages, resting dendritic cells, activated dendritic cells, resting mast cells, and activated mast cells are higher in the high-risk group. While various types of T cells, containing CD8 T cells, naïve CD4 T cells, resting memory CD4 T cells, activated memory CD4 T cells, and follicular helper T cells are higher in the low-risk group (Figure 7).

4. Discussion

The altered energy metabolism is one of the most important signs of cancer development. The famous Warburg effect exists in majority of cancers [4951]. More and more evidence proved that the chaotic energy metabolism played a significant role in occurrence and development of DLBCL. The high expression of lactate dehydrogenase (LDH) is a dangerous signal for patients [5254]. The inhibitor of monocarboxylate transporter 1 (MCT1) suppressed the growth of DLBCL significantly when combining with mitochondrial respiratory Complex I inhibitor [55, 56]. The inhibition of mitochondrial oxidative phosphorylation by OPB-111077 or metformin could improve the clinical outcome of DLBCL [57, 58].

Many previous studies have shown that multiple MTGs can be used as a diagnostic tool and provide medical workers to predict values for many types of cancer, such as prostate cancer, lung adenocarcinoma, low-grade glioma, oral squamous cell carcinoma, and so on [30, 59, 60]. In this work, a prognosis signature was established using 16-MTGs. Furthermore, unsupervised cluster analysis was performed, that result demonstrated that DLBCL patients could be identified two clusters and the overall survival rates of the two subgroups were different. The new classification subtype is not exactly the same as the previous one, but it provides energy-related molecular characteristics. This result can deepen our understanding of the changes in energy metabolism during the pathogenesis of DLBCL. An IRG signature was conducted for DLBCL in the previous research, and we compared these two models by DCA and found that the metabolic-related prognostic model we established, perform better.

All of these genes were involved in the regulation of cellular metabolism in a direct or indirect manner. Some are directly involved in the TCA cycle (PPAT), some in lipid metabolism (CBX5, FBXW7, and TREM2), some in glucose metabolism (RB1), some in nucleotide metabolism (PRPS2), and some in regulating mitochondrial function (ABCB7 and POLG). The expression of ABCB7 was critical in regulating iron metabolism and maintaining mitochondrial function [61, 62]. One of the main functions of CBX5 is to participate in gene silencing. The loss of CBX5 confers EGFR inhibitor resistance in lung cancer cells [63]. CBX5 plays different roles in different cancers, such as increased CBX5 transcription causes carcinogenesis in gliomas [64], but its expression product HP1α in breast cancer can inhibit the metastasis of cancer cells [65]. Cholesterol dehydrogenase (CHDH) regulates lipid metabolism by catalyzing cholesterol oxidation in an oxygen-independent manner [66]. FBXW7 determines cell fate by modulating lipid metabolism and glutamine metabolism [67, 68]. Glycogenin-1 (GYG1) is involved in glycogen synthesis [69]. Mannose phosphoisomerase (MPI) is the first enzyme that leads to the production of GDP-Mannose [70]. POLG is critical to the mitochondrial function and mitophagy in astrocytes [71, 72]. Phosphoribosyl-pyrophosphate synthetase 2 (PRPS2) promotes nucleotide biosynthesis [73]. The function of RAPPES2 in regulating lipid metabolism has been investigated in breast cancer [74]. The loss of RB1 promotes a glycolytic phenotype [75]. TREM2 affects the metabolism of cholesterol, myelin, and phospholipids [7679]. MYC is usually overexpressed in almost all cancers, and its ability to alter metabolism had been well-established. MYC could induce aberrant choline metabolism by transcriptionally activating the key enzyme phosphate cytidylyltransferase 1 choline-α (PCYT1A) [80]. Minoacyl-tRNA_biosynthesis pathway was significantly influenced by NOTCH3 [81].

Through survival analysis, we found the prognosis of DLBCL significantly correlated with nine MTGs, containing CHDH, GYG1, MPI, POLG, PPAT, IFIH1, NOTCH3, RARRES2, and TREM2. Through literature review, we found these genes are all related to the occurrence and development of cancers or tumors. The high expression of IFIH1, NOTCH3, RARRES2, and TREM2 shows satisfactory prognosis for DLBCL. While the high expression of CHDH, GYG1, MPI, POLG, and PPAT is negatively correlated with the prognosis of DLBCL. Consistent with our results, the high expression of NOTCH3 is positively correlated with the good prognosis of metastatic medullary thyroid cancer [82]. As a tumor suppressor gene, RARRES2 increases its expression in breast cancer and inhibits tumor growth by recruiting NK and T cells [83]. Similarly, our results also showed that high expression of RARRES2 was beneficial to the survival of DLBCL. However, some of our results were not consistent with the previous. The high expression of CHDH and POLG has a poor prognosis in DLBCL, while, there are antithetical outcomes for the research of estrogen receptor (ER)—positive breast cancer and gastric cancer [84, 85]. Overexpression of NOTCH3 is associated with poor prognosis in human ovarian epithelial cancer [86]. Moreover, the high expression of TREM2 is related to the poor prognosis of gastric cancer [86]. In a word, our research results were basically consistent with the previous research results and proved that the screened genes are valuable.

GEAs were conducted to gain insight into the molecular mechanism of the occurrence and development of DLBCL. Beta-alanine metabolism and regulation of actin cytoskeleton signaling pathway were enriched in the low-risk DLBCL group by GSEA. The actin cytoskeleton is a highly dynamic network composed of actin polymers and a large number of related proteins. The remodeling and function of the actin cytoskeleton are regulated by signal pathways [87]. Resibufogenin, as a traditional Chinese medicine for the treatment of a variety of malignant tumors, inhibits the growth of ovarian clear cell carcinoma (OCCC) by downregulating the actin cytoskeleton signaling pathway [88]. Studies have found that the actin cytoskeleton signaling pathway is significantly affected in patients with cervical cancer [89]. Previous studies have shown that the stromal supply of alanine supports the metabolism, growth, and treatment resistance of pancreatic ductal adenocarcinoma [90, 91].

KEGG pathways enrichment analysis indicated that these pathways contained biosynthesis of amino acids, valine, leucine, and isoleucine degradation, cysteine, and methionine metabolism. Almost all enriched pathways have been confirmed to be closely related to cancer. Glutathione metabolism is essential for normal cell function [71, 72, 9296]. Many cancer types, including breast, liver, colon, and gastric cancers, have increased glutathione content in cancer stem cells and counteract the activity of antineoplastic agents through the detoxification ability of glutathione [97, 98]. In the study of esophageal cancer, it has been confirmed that inhibiting glutathione metabolism can slow down the progression of cancer syndrome [99]. Oncogene-driven cell growth activation is linked to the increased amino acid uptake and biosynthesis [100]. When cysteine synthesis is upregulated in cancer, the upregulation of catabolic pathways and the expression of cysteine transporters are often observed [100]. Tumor cells enthusiastically consume methionine by expressing high levels of the methionine transporter SLC43A2, and then T cells cannot obtain methionine, resulting in impaired T cell immunity [101]. The change of N-glycan biosynthesis can interfere with cell proliferation, differentiation contributes to tumor development and enhance the metastasis and spread of primary tumors [102].

Considering the vital role of immunity, we conducted the correlation analysis between MTGs and IRGs and immune infiltration analysis. Interestingly, among the 16 genes in this prognostic signature, the genes that had the weaker impact on the prognosis of DLBCL, such as CBX5, MPI, and PPAT, was more closely related to the immune genes. However, their inner connection needs further research. At the same time, a study has found that there is a significant correlation between the expression of CBX and the infiltration of immune cells (B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells) [103]. These immune cells are highly consistent with those extracted from our immune infiltration analysis. The expression level of follicular helper T cells was higher in the low-risk DLBCL group, indicating that cancer patients with a high proportion of helper T cells have a higher survival rate, which has been verified in acute myeloid leukemia and melanoma [104, 105]. The presence of M2 macrophages was conducive to the growth and spread of tumors, which was indirectly verified by our result that M2 macrophages are highly expressed in the high-risk DLBCL group.

Although our results have clinical prognostic significance, their limitations also exist. First, compared with widely studied cancer types, the sample size of DLBCL is not big enough. Second, the development of 16-MTGs prognostic signature depended on the RNA-seq series, and the procedures of sample processing, RNA extraction, reverse transcription, and detection needs to be standardized and normalized. Third, we have verified the prognostic value of this signature through three external data sets, but in vivo or in vitro experimental verification were not conducted.

5. Conclusion

In summary, 6-MTGs prognostic signature was established and verified, the 3-, 5-, and 10-year survival rate of DLBCL patients can be well predicted. This signature provides potential clinical application value for the precise treatment strategy of DLBCL.

Abbreviations

DLBCL:Diffuse large B-cell lymphoma
MTGs:Metabolism-related genes
GSEA:Gene set enrichment analysis
GCB-1ike:Germinal center B-cell-like
ABC-like:Activated B-cell-like
R-CHOP:Regimen of rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone
PI3K:Phosphatidylinositol-3-kinase
PKB/AKT:Protein kinase B
mTOR:Mammalian target of rapamycin
OS:Overall survival
GEO:Gene expression omnibus
GEPIA:Gene expression profiling interactive analysis
DE-MTGs:Differentially expressed-MTGs
LASSO:Least absolute shrinkage, and selection operator
K–M:Kaplan–Meier
ROC:Receiver operating characteristic
DCA:Decision curve analysis
FDR:False discovery rate
GO:Gene ontology
KEGG:Kyoto encyclopedia of genes and genomes
PCA:Principal component analysis
IRGs:Immune-related genes
ImmPort:Immunology database and analysis portal
ER:Estrogen receptor
MPI:Mannose phosphate isomerase
OCCC:Ovarian clear cell carcinoma.

Data Availability

All the original data in this study were obtained from the online repository. Please see the text for the name of the specific repository/repositories and accession number(s).

Disclosure

This paper is already published in the preprint given in the following link: “https://www.researchsquare.com/article/rs-1468741/v2.”

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

SJ W was responsible for conceptualization, writing—original draft, and writing—review and editing. PC F and XQ W had contributed to the software visualization and writing—review and editing. GX L was responsible for the software validation and editing. All authors contributed to this article and agreed to the submitted version.

Acknowledgments

We thank the Department of Chemistry in Changzhi University, and the Departments of Anesthesiology and Basic Medicine in Changzhi Medical College. This study is financially supported by the Basic Research Program of Shanxi Province (202103021223382) and the Chunhui Project of Changzhi Medical College (QD201918).

Supplementary Materials

Supplementary 1. Table S1: detailed information of clinicopathological characteristics of each series. Table S2: detailed information of enrichment analysis. Table S3: potential drugs for DLBCL patients.

Supplementary 2. Figure S1: the differential expression genes of GSE11318 and GSE8762 shown in heatmap. Figure S2: Kaplan–Meier plotters showed overall survival of MTGs in DLBCL. Figure S3: verification of the prognostic 16-MTGs signature in GEO cohorts. Figure S4: the legends at the top of the image showed the results of analyzing the differences between the high- and low-risk groups using T-tests. Figure S5: GSEA depicts the biological pathways related to the risk score value of the prognostic 16-MTGs signature and pathways enriched in the low-risk group. Figure S6: the correlation between 16 MTGs and survival-related immune genes.