Abstract

Background. Head and neck squamous cell carcinoma (HNSCC) is a growing concern worldwide, due to its poor prognosis, low responsiveness to treatment, and drug resistance. Since immunotherapy effectively improves HNSCC patients’ survival status, it is important to continuously explore new immune-related predictive factors to accurately predict the immune landscape and clinical outcomes of individuals suffering from HNSCC. Methods. The HNSCC transcriptome profiling of RNA-sequencing data was retrieved from TCGA database, and the microarray of GSE27020 was obtained from the GEO database for validation. The differentially expressed genes (DEGs) between HNSCC and normal samples were identified by multiple test corrections in TCGA database. The univariate and multivariate Cox analyses were performed to identify proper immune-related genes (IRGs) to construct a risk model. The Cox regression coefficient was employed for calculation of the risk score (RS) of IRG signature. The median value of RS was utilized as a basis to classify individuals with HNSCC into high- and low-risk groups. The Kaplan-Meier (K-M) survival analysis and receiver operating characteristic (ROC) curves were employed for the identification of the prognostic significance and precision of the IRG signature. The signature was also evaluated based on clinical variables, predictive nomogram, mutation analysis, infiltrating immune cells, immune-related pathways, and chemotherapeutic efficacy. The protein-protein interaction (PPI) network and functional enrichment pathway investigations were utilized to explore possible potential molecular mechanisms. Finally, the hub gene’s differential mRNA expression levels were evaluated by means of the Gene Expression Profiling Interactive Analysis (GEPIA), and the Human Protein Atlas (HPA) was utilized for the validation of their translational levels. Results. Collectively, 1593 DEGs between HNSCC and normal samples were identified, of which 136 IRGs were differentially expressed. Then, the 136 immune-related DEGs were mostly enriched in the cytokine-related signaling pathways by GO and KEGG analyses. After that, a valuable signature based on seven genes (DKK1, GAST, IGHM, IL12RB2, SLURP1, STC2, and TNFRSF4) was designed. The HNSCC patients into the low-risk group and the high-risk group were divided by using the median RS; the HNSCC patients in the high-risk group had a worse survival than those in the low-risk group. The risk signature was verified to be an independent predictive marker for HNSCC patients. Meanwhile, the RS had the largest contribution to survival of these patients based on the predictive nomogram. In addition, the low-risk HNSCC patients exhibited significantly enriched immune cells, along with an association with high chemosensitivity. Conclusion. The constructed gene signature can independently function as a predictive indicator for the clinical features of HNSCC patients. The low-risk HNSCC subjects might benefit from immunotherapy and chemotherapy.

1. Introduction

As the sixth most prevalent type of malignancy, head and neck squamous cell carcinoma (HNSCC) is the seventh main cause of cancer-related mortalities globally [1]. A study conducted in the United States predicted that by 2022, approximately 66,470 new HNSCC cases would arise and 15,050 HNSCC-related deaths would occur [2]. Despite steady advancements in relevant medical treatments, like surgery, radiotherapy, and chemotherapy, the five-year survival of individuals with HNSCC has not significantly improved [3]. Therefore, finding new and innovative novel prognostic factors for HNSCC patients is an urgent need.

HNSCC is considered an immunodeficiency disease. The main mechanisms underlying the disease include the induction of immune tolerance, local immune escape, and the destruction of T-cell signals [4]. The immune microenvironment of HNSCC has been widely studied [5, 6]. For instance, the human leukocyte antigen (HLA) is responsible for the vital function of transmitting signals between tumor antigen peptides and killer T cells [7]. A previous study demonstrated that more than 50% of HNSCC patients had low HLA expression, with extensive lymph node metastasis and poor prognosis [8]. HNSCC tumor cells could also release chemical factors, to induce many immunosuppressive hematological cells to enter the immune microenvironment, thus suppressing the immune response [9].

Meanwhile, studying how HNSCC survival is linked to infiltrating immune cell proliferation and function could improve the survival of HNSCC patients [1012]. Immunotherapy’s effect on the clinical outcomes of individuals with HNSCC has also been intensively studied [1315]. This is why the exploration of immune-related biomarkers to anticipate the clinical features of individuals with HNSCC is imperative. Recent studies demonstrated that immune-related biomarkers could affect the biological behavior of HNSCC as well as the status of patients. For instance, Yao et al.’s model consisted of four immune-related genes (IRGs), including PVR, TNFRSF12A, IL21R, and SOCS1 [16]; Chen et al. constructed predictive model based on three IRGs (SFRP4, CPXM1, and COL5A1) [17]; and Zhang et al. established a model based on six IRGs (PLAU, STC2, TNFRSF4, PDGFA, DKK1, and CHGB) for the prognostic prediction of HNSCC [18]. Although these studies have constructed a proper model to predict the prognosis of patients with HNSCC, the progression of HNSCC is complexity and uncertainty. Therefore, to date, reliable and predictive biomarkers for identifying HNSCC are still limited; it is essential to continuously look for newly representative biomarkers.

During this research, a risk signature of seven immune-related genes was developed for accurately predicting the clinical outcomes of HNSCC subjects, which may provide an effective treatment strategy for these patients.

2. Methods

2.1. Dataset

The transcriptome profiling of RNA-sequencing (FPKM) was attained from The Cancer Genome Atlas (TCGA) (https://cancergenome.nih.gov/) containing 502 cancerous and 44 healthy tissues samples, along with the relevant clinical information for 528 HNSCC patients. In total, 2483 IRGs were procured from The ImmPort database (https://immport.niaid.nih.gov/) [19, 20]. Besides, the microarray and clinical information of GSE27020 containing 109 HNSCC samples were provided by the Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/gds/) for authentication.

2.2. Development of the IRG-Based Signature

The differentially expressed genes (DEGs) between cancerous and healthy samples were identified with the help of TCGA database after multiple test corrections by false discovery rate and [21]. Then, screening the intersections of these DEGs with IRGs was carried out. Univariate Cox proportional hazards regression analysis was employed for the determination of the predictive ability of IRGs for the overall survival (OS) of individuals with HNSCC with the aid of the “survival” package in R [22]. The genes with a threshold of were subjected to further evaluation using multivariate Cox regression analysis [23]. Then, the expression levels of the hub genes were compared for further exploring their expression features in normal and HNSCC tumor tissues. Subsequently, the calculation of the IRG-based signature-related risk score (RS) was done using the Cox regression coefficient and gene expression formula given below:

, Expi, and Co-eff indicate signature gene number, gene expression levels, and regression coefficient values, respectively. Using the median value of RS as a criterion, the individuals with HNSCC were classified into the low- and high-risk groups.

2.3. Prognosis Prediction by the IRG-Based Signature

For the verification of the IRG-based signature’s prognostic performance, the signature’s impact on OS in both risk groups was subjected to comparison by Kaplan-Meier (K-M) survival analysis utilizing the “survival” package in R software, followed by the area under the curve (AUC) of the receiver operating characteristic (ROC) curves to assess IRG-based signature’s accuracy through the “survival ROC” package in R software [24]. Subsequently, the GEO database (GSE27020) was used for external validation. It was also subjected to K-M survival analysis and ROC curves to identify the signature’s prognostic value and precision. The RS distribution and survival status of individuals with HNSCC in TCGA were constructed to further understand the prognostic capability of the signature.

2.4. Correlation Analysis

The association of IRG-based signature with clinical variables (age, pathological grade, gender, and tumor and TNM stages) was analyzed. Moreover, the clinical factors and robustness of the signature in predicting the OS were demonstrated by employing univariate and multivariate Cox regression analyses.

2.5. Construction of a Predictive Nomogram

Based on clinical variables, a nomogram was used to establishment a prognostic scoring system for predicting survival in HNSCC patients both in TCGA and GSE27020 databases.

2.6. Somatic Mutation Analysis

We obtained the somatic mutation profiles of all tumor samples from TCGA database and explored the mutation analysis for 528 patients. The R software “maftools” package was utilized to analyzed and visualized for mutation data of the low-risk group and the high-risk group.

2.7. Immune Microenvironment Analysis

Considering the involvement of infiltrating immune cells in the tumor microenvironment, the single-sample gene set enrichment analysis (ssGSEA) algorithm was utilized to evaluate the immune score of each HNSCC sample from TCGA and GSE27020 databases [25]. The different proportions of the infiltrating immune cells between the low- and high-risk groups were assessed by the Wilcoxon test. Moreover, the association between the RS and immune-related biological functions was performed for further exploring the underlining mechanisms. The gene expression profiles corresponding to samples of TCGA and GSE27020 databases were selected to perform the gene set variation analysis (GSVA).

2.8. Prediction of Clinical Application

The calculation of the half inhibitory concentration (IC50) of common chemotherapeutic agents was done, and the differences in the IC50 across the two risk groups were also evaluated for predicting the clinical application of the IRG-based signature both in TCGA and GSE27020 databases.

2.9. Molecular Mechanism Analysis

The STRING biological database (https://string-db.org/) was applied for extraction of the protein-protein interaction (PPI) network [26] as a mathematical representation of the physical contacts among differentially expressed IRGs linked to HNSCC patient survival. Thereafter, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were employed in determining the potential function of the immune-related DEGs [27].

2.10. Investigating the Expression of Hub Genes

The Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/) was utilized for the investigation of the differential mRNA expression profiles of the hub genes in the IRG-based signature. Moreover, the Human Protein Atlas (HPA) (https://www.proteinatlas.org/) was employed for the purpose of validating the translational levels of these hub genes.

2.11. Statistical Analysis

R version 3.6.2 was utilized to conduct statistical analysis procedures. DEGs were compared with multiple test corrections with were viewed as being dramatically dysregulated. The survival curves were estimated by using the K-M survival analysis and log-rank test between different groups. Clinicopathological features were compared by univariate and multivariate Cox regression analyses. The ssGSEA algorithm and Wilcoxon test were used to compare different proportions of the infiltrating immune cells between different groups. The -test or Wilcoxon test for comparisons of two variables, and a (two-side) was taken as a statistically significant value.

3. Results

3.1. Differential Gene Expression Analysis

TCGA database was employed to retrieve the HNSCC RNA-sequencing data comprising 502 tumor samples and 44 healthy samples. Among these patients, 528 HNSCC subjects with gene expression profiles and clinical follow-up data were included. The workflow of this research is illustrated in Figure 1. In the differential gene expression analysis, 1593 DEGs between HNSCC and healthy samples were identified (Figures 2(a) and 2(b)), of which 136 IRGs were differentially expressed (Figures 2(c) and 2(d)). Among these genes, 13 genes were identified to predict survival in the univariate Cox regression analysis. To study their interactions, the STRING biological database was utilized to construct the PPI network, containing 11 nodes and 22 edges. Based on the degree of genes, IL1A, CTLA4, CCR8, IL12RB2, TNFRSF4, CXCL13, and PLAU appeared to be the core genes among these IRGs (Figure 3(a)). As for functional analysis in Figure 3(b), the 136 immune-related DEGs were mostly enriched in immune response/cytokine mediation (BP), immunoglobulin complex/external side of plasma membrane (CC), and cytokine activity/signaling receptor activator activity/receptor ligand activity (MF) by GO analysis. By KEGG pathway analysis, the genes were mostly enriched in cytokine-cytokine receptor interaction, viral protein interaction with cytokine and cytokine receptor, and chemokine signaling pathway.

3.2. Development and Verification of the IRG-Based Prognostic Signature

The detailed characteristics along with population demographics are given in Table 1. To develop a predictive IRG-based signature, seven IRGs were chosen after univariate and multivariate Cox analyses (Table 2). Meanwhile, the expression levels of the seven IRGs were further investigated. Compared with normal tissues, only SLURP1 was downregulated in tumor tissues, while the expression levels of DKK1, GAST, IGHM, IL12RB2, STC2, and TNFRSF4 were upregulated in tumor tissues (Figure 3(c)). Then, the calculation of the RS of this IRG-based signature was done as follows: .

Moreover, individuals with HNSCC were categorized into the low-risk group () and the high-risk group () as per the median RS. HNSCC patients at high risk showed a worse survival in the K-M analysis, in comparison to the patients at low risk () (Figure 4(a)), with the AUC of 0.685 for the 5-year ROC curve (Figure 4(b)), indicating a certain predictive value of the signature in predicting the survival of individuals with HNSCC. Meanwhile, this IRG-based signature was validated in the GEO database (GSE27020) of 109 HNSCC patients who were also grouped into the low-risk group () and the high-risk group (). Consistent with TCGA database, the K-M analysis of the GEO data exhibited that high-risk HNSCC individuals presented a worse survival in comparison with the low-risk group () (Figure 4(c)), with the AUC of 0.637 for 5-year ROC curve (Figure 4(d)).

Additionally, there were more deaths in HNSCC patients with the elevation in the value of RS (Figures 4(e) and 4(f)), and the seven genes showed differences in mRNA expression across the two groups in the heat map (Figure 4(g)).

3.3. Use of IRG-Based Signature as an Independent Prognostic Marker for HNSCC

Univariate and multivariate Cox regression analyses were conducted for assessing the correlations between the IRG-based signature and clinical variables (age, gender, grade, tumor stage, and TNM stage). The findings indicated that OS of HNSCC patients was significantly associated with age (), M stage (), and the RS calculated from the IRG-based signature () (Figure 5(a)) in univariate Cox regression analysis and also with age (), M stage (), and the RS () in multivariate Cox regression analysis (Figure 5(b)).

Furthermore, the associations between clinicopathological parameters and the RS and associations between the seven genes and clinical variables were also evaluated (Table 3, Figure 5(c)). The results revealed that the male gender, high tumor stage, and T stage were linked to a greater value of RS. In addition, mRNA expression levels of IGHM and SLURP1 appeared to be elevated in females in comparison to males. The mRNA expression level of STC2 appeared to be elevated in males when compared with females, and a high pathological grade was correlated with lower mRNA expression of GAST and SLURP1. The results also suggested that higher mRNA expression of IGHM was remarkably linked to a high grade. Greater mRNA expression of GAST was significantly linked to a more advanced tumor stage. The elevated mRNA expression level of DKK1 and GAST was correlated with the advanced T stage. Moreover, lower mRNA expression of IGHM, as well as higher mRNA expression of SLURP1 and STC2, was correlated with the advanced M stage.

3.4. Construction of a Predictive Nomogram

The clinical variables and RS were included in the nomogram. As indicated in the nomogram, the RS had the largest contribution to survival of patients with HNSCC both in TCGA and GSE27020 databases (Figures 6(a) and 6(b)).

3.5. Somatic Mutation Analysis

We obtained somatic mutation profiles of 528 patients in TCGA database. Around 241 (97.57%) and 229 (93.47%) samples possessed somatic mutations in the high-risk and low-risk groups, respectively. The top 30 mutated genes for high-risk and low-risk groups are shown in Figures 7(a) and 7(b). The results indicated that the TP53 mutated most frequently approximately accounting for 78% and 62% in the high-risk and low-risk groups, respectively.

3.6. Immune Microenvironment Analysis

The association between 23 immune cells infiltration differences and different risk groups was analyzed in TCGA and GSE27020 databases. Patients in the low-risk group showed higher infiltration levels of these 23 immune cells in TCGA database (Figure 8(a)), and patients in the low-risk group were more correlated with the infiltration of activated CD8 T cell, activated dendritic cell, CD56dim natural killer cell, eosinophil, immature B cell, mast cell, MDSC, monocyte, natural killer cell, natural killer T cell, neutrophil, T follicular helper cell, type 1 T helper cell, and type 17 T helper cell in the GSE27020 database (Figure 8(b)).

In addition, the relationship between immune pathway scores and RS were analyzed in order to better explore the immune-related biological functions. Functions with a correlation greater than 0.2 and are shown in Supplementary Figure 1. The results indicated that 14 immune-related pathways were correlated negatively with the RS in TCGA database (Supplementary Figure 1A). In the GSE27020 database, 8 immune-related pathways were correlated negatively with the RS, while 1 was correlated positively (Supplementary Figure 1B). These immune-related pathway scores vary with increasing levels of RS, implying that an imbalance in these pathways is closely related to tumor development.

3.7. Prediction of Clinical Application

The association of risk with the therapeutic efficacy of common chemotherapeutic agents in HNSCC was also studied. The findings exhibited that the low-risk HNSCC patients presented increased sensitivity to Elesclomol, GW843682X, Midostaurin, Pazopanib, QS11, and Salubrinal in TCGA database (Figure 9(a)), and the low-risk group was more likely with higher sensitivity of Bexarotene, BI.2536, MG.132, QS11, Salubrinal, and Thapsigargin in the GSE27020 database (Figure 9(b)). The results indicated that HNSCC patients with low risk represented higher sensitivity to chemotherapy.

3.8. Investigation of the Expression of the Seven IRGs

The expression of the seven IRGs in HNSCC was explored with the help of the GEPIA database. The expression levels of the seven IRGs varied remarkably across cancerous and healthy tissues (Figure 10(a)). However, to validate these findings, more experimental analyses were required. Moreover, the HPA database was employed to investigate the expression of the seven IRGs at the translation level. Among the seven IRGs, expressions of IGHM and SLURP1 were lower in the HNSCC tissues. Moreover, STC2 showed higher expression in HNSCC by immunohistochemistry. No remarkable variations were observed in the expressions of GAST, IL12RB2, and TNFRSF4 across normal and HNSCC tissues, while DKK1 was not detected by immunohistochemistry in the HPA database (Figure 10(b)). However, to further validate the translational relevance of the seven IRGs on HNSCC, more clinical analyses on HNSCC samples are needed.

4. Discussion

During this research, an IRG-based signature was established, which was capable of anticipating the clinical landscapes of HNSCC patients and correlated with clinicopathological characteristics of affected individuals, the numbers of tumor-infiltrating immune cells, and the efficacy of common chemotherapeutics. These findings suggested that this signature may be valuable for predicting HNSCC-related prognosis and provide good clinical application in immunotherapy and chemotherapy.

The IRG-based signature consisted of seven genes (i.e., DKK1, GAST, IGHM, IL12RB2, SLURP1, STC2, and TNFRSF4). Among them, DKK1 is a member of the DKK family and regulates cell proliferation, migration, and apoptosis in various tumor tissues through β-catenin-dependent and β-catenin-independent mechanisms [28]. Moreover, as a tumor suppressor gene, DKK1 causes apoptosis and suppresses cell proliferation [29]. Gao et al. suggested that elevated DKK1 expression levels can predict poor prognosis in HNSCC patients [30]. STC2 regulates tumor cell proliferation, apoptosis, and angiogenesis and is also vital for the invasiveness and metastasis of HNSCC [31]. IL12RB2 is a subunit of the IL-12 receptor, and an increased ratio of IL12RB2-positive tumor-infiltrating lymphocytes is indicative of a good prognosis in laryngeal cancer [32]. SLURP1 belongs to the Ly6/uPAR family that lacks a GPI-anchoring signal sequence and is associated with a poor prognosis of HNSCC [33]. Furthermore, one of the tumor necrosis factor receptors, TNFRSF4, could be a useful target for immunotherapy of HNSCC [34]. Although there are no published reports on GAST and IGHM for HNSCC, these genes may be related to tumorigenesis and development [35, 36]. In general, these previous findings emphasize the importance of these seven genes in HNSCC prognosis prediction. Furthermore, the expression levels of GAST, IL12RB2, and TNFRSF4 in HNSCC samples appeared to be elevated in healthy tissues from the GEPIA database, while no apparent variations were observed between cancerous and healthy tissues from the HPA data. Except for SLURP1 and STC2, IGHM expression in HNSCC tissues was remarkably increased compared to that in healthy samples from the GEPIA database, which was inconsistent with the HPA database. This could be due to abnormal methylation. However, further experimentation is required to confirm this finding.

In the multivariate analysis for the associations between clinicopathological factors and the risk IRG-based signature, a high-immune RS was linked to a high tumor stage and T stage. Also, the signature predicted the possible clinical features of HNSCC subjects, likely by regulating the tumor immune microenvironment. The tumor-infiltrating immune cells are known to be correlated with the progression and OS of HNSCC subjects [37], and a high level of infiltrating immune cells is often a good predictor for the OS of patients [38, 39]. Therefore, the risk IRG-based signature is expected to correlate with infiltrating immune cells. As expected, low-risk HNSCC patients had increased infiltration rates of 23 immune cells, indicating the effectiveness of immunotherapy in the low-risk category compared with that in the high-risk category. Owing to the importance of chemotherapy in HNSCC, the IC50 values of various chemotherapeutic agents were compared in the two groups. A lower RS was linked to a higher IC50 of QS11 and Salubrinal in both TCGA and GSE27020 databases. The QS11 is an inhibitor of ADP-ribosylation factor GTPase-activating protein 1, which modulates Wnt/β-catenin signaling through an effect on protein trafficking [40]; and the Salubrinal is a selective cell complex inhibitor that inhibits endoplasmic reticulum stress-mediated apoptosis. Despite these two drugs are not commonly used as chemotherapy drugs for HNSCC, the findings of this study may be valuable for future research.

This study had some limitations. (i) Owing to limited HNSCC samples in TCGA and GSE27020 databases, an issue of the time period was evident in this study. (ii) The analyses were performed using publicly available data from retrospective studies, and the outcomes must be validated in further research with larger samples and functional experiments. (iii) There is a need for further exploration of other possible predictive factors linked to clinical outcomes in HNSCC. (iv) There is a need for further investigation of the mechanisms underlying the functions of the IRG-based signature in HNSCC. Bioinformatics analysis with a specific reference value was used as a basis to conclude this study. Further corresponding molecular experiments are required to validate these findings.

To conclude, a risk signature based on seven IRGs (DKK1, GAST, IGHM, IL12RB2, SLURP1, STC2, and TNFRSF4) was developed. This signature serves as a potential biological marker and treatment target for immunotherapy and chemotherapy of HNSCC. These findings may facilitate future studies on the molecular mechanisms of HNSCC.

Data Availability

The datasets analyzed during the current study are available in TCGA database (https://portal.gdc.cancer.gov/), ImmPort database (https://immport.niaid.nih.gov/), GEO database (https://www.ncbi.nlm.nih.gov/gds/), TIMER (https://cistrome.shinyapps.io/timer/), STRING biological database (https://string-db.org/), GEPIA database (http://gepia.cancer-pku.cn/), and HPA database (https://www.proteinatlas.org/).

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

Acknowledgments

This work was supported by the Research Program of Shenzhen Innovation Council (grant number JCYJ20200109140208058), Shenzhen Fund for Guangdong Provincial High-level Clinical Key Specialties (grant number SZGSP008), and Sanming Project of Medicine in Shenzhen (grant number SZSM202111012, Oral and Maxillofacial Surgery Team, Professor Yu Guangyan, Peking University Hospital of Stomatology). The authors thank all data providers, patients, investigators, TCGA database, ImmPort database, GEO database, STRING biological database, GEPIA database, HPA database, and institutions involved in these studies.

Supplementary Materials

Supplementary Figure 1: immune-related biological functions. (Supplementary Materials)