Abstract

Epithelial-mesenchymal transition (EMT) can promote carcinoma progression by multiple mechanisms; many studies demonstrated the invasiveness of pancreatic adenocarcinoma (PAAD) associated with the EMT, but how it acts through an lncRNA-dependent manner is unknown. Here, we investigated 146 samples from The Cancer Genome Atlas (TCGA) and 92 samples from the International Cancer Genome Consortium (ICGC). By gene set variation analysis (GSVA) and weighted correlation network analysis (WGCNA), we explored the EMT-related long noncoding RNAs (EMTlnc). Then, we performed univariate Cox regression analysis to screen their prognostic value for PAAD. The least absolute contraction and selection operator (LASSO) Cox regression was used to establish EMT-related lncRNA prognostic signal (EMT-LPS). In addition, we established a competitive endogenous ceRNA network. Then, we identified 33 prognostic EMTlnc as prognostic lncRNAs and established an EMT-LPS which showed strong prognostic ability in stratification analysis. By corresponding risk scores, patients were divided into low-risk and high-risk subgroups. Principal component analysis (PCA) showed that these subgroups had individual EMT status. Enrichment analysis showed that in the high-risk subgroup, biological processes, pathways, and hallmarks related to malignant tumors are more common. What is more, we constructed a nomogram that had powerful ability to predict the overall survival rate (OS) of PAAD patients in two datasets. So, EMT-LPS are a principal element in PAAD’s carcinoma progression and may help us in choosing the way of prognosis assessment and provide some clues to design the new drugs for PAAD.

1. Introduction

Pancreatic adenocarcinoma (PAAD) is a malignant disease with poor prognosis, and the cure rate for this disease is only 9%, moving to be the third principal cause of cancer death; if untreated, the median survival was only 3 months of those patients with metastatic disease [1]. Thus, it is very urgent to search for more therapeutic targets for PAAD.

Long noncoding RNAs (lncRNAs) are a ribonucleic acid molecule with a length of more than 200 nucleotides, which does not have any protein coding function and participates in various cellular processes. An accumulating body of research indicates that the dysregulation of lncRNA expression is implicated in many kinds of cancer [2], including proliferation, apoptosis, migration, and invasion [35]. Moreover, lncRNAs can also act as diagnostic or prognostic markers in a range of cancer types, for an example, hepatocellular carcinoma and prostate cancer [68].

Epithelial-mesenchymal transition plays a prominent role in the formation of body plan and the differentiation of multiple tissues and organs. It is defined as the phenotypic transition from an epithelial state to a mesenchymal state in the morphological cell program. Study reveals that EMT can not only cause organ fibrosis but also promote carcinoma progression [9]. It involves a complex network including epigenetic modifications, transcriptional control, alternative splicing, protein stability, and subcellular localization [1012]. Given that EMT are documented in many cancer cell models, the importance of EMT in cancer progression and the correlation in cancer tissues are still a matter of controversy. A multitude of researches have proven the invasiveness of PAAD associated with the EMT [1316], while during the PAAD progression, how it works in an lncRNA-dependent way is unclear.

In our research, relying on The Cancer Genome Atlas (TCGA) dataset () and the International Cancer Genome Consortium (ICGC) dataset () and through the bioinformatics and statistical analysis of the PAAD patient data, the prognostic significance of the EMTlnc is determined. The results showed that 33 EMTlnc had prognostic value for PAAD patients in these two datasets. What is more, on the strength of the ability of 33 EMTlnc, we constructed an EMT-dependent predictive sign of lncRNA (EMT-LPS) that can predict the OS of PAAD patients. In the meantime, we found that in patients with PAAD, there exist different prognoses between low-risk and high-risk subgroups, and tumor features are more common in high-risk subgroups. Finally, we establish an accurate nomogram to predict the OS for PAAD patients, and a ceRNA network to search for target miRNA and mRNA was built.

2. Materials and Methods

2.1. Raw Data Acquisition

For the training set, mRNA expression files (Fragments Per Kilobase of transcript per Million mapped reads (FPKM) normalized) and the corresponding clinicopathological data were obtained from the Genomic Data Commons Data Portal (https://portal. http://gdc.cancer.gov/). For the validation set, the RNA-seq profile and associated clinicopathological data were downloaded from UCSC Xena Database (http://xena.ucsc.edu/) and the International Cancer Genome Consortium Data Portal (https://dcc.icgc.org/). For the purpose of reducing statistical bias, we excluded the PAAD patients with missing OS values or . Finally, 146 PAAD patients from TCGA were selected to construct the EMT-LPS, and 92 PAAD patients from ICGC were included to test the EMT-LPS. Then, the FPKM data underwent a log2 transformation. The gene annotation file “gencode.v22.annotation,” which was downloaded from TCGA database, was utilized to transform ENSEMBL ID to GENE SYMBOL. On the basis of the GENCODE website (https://www.gencodegenes.org/human/), 9 types of transcripts (3prime_overlapping_ncRNA, antisense, bidirectional_promoter_lncRNA, lincRNA, macro_lncRNA, non_coding, processed_transcript, sense_intronic, and sense_overlapping) were defined as lncRNAs. The ENSEMBL ID with the max average expression level was used to represent the expression level of this gene when there were multiple ENSEMBL ID annotated to the same gene symbol. Ultimately, in TCGA cohort, there were 14805 lncRNAs and 19712 mRNAs, and in the ICGC cohort, the data were 1440 lncRNAs and 15146 mRNAs.

2.2. EMTlnc Acquisition by GSVA and WGCNA

The HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION pathway, which contained 200 EMT-related genes, was obtained from the package “msigdbr” [17, 18]. The EMT pathway values for each PAAD patient were calculated using gene set variation analysis (GSVA) in TCGA cohort. Then, weighted gene coexpression network analysis (WGCNA) was applied to find significantly relevant lncRNA binding with lncRNA modules and find the relationship between the modules. Here, to provide a scale-free network, 50 were set to minimum cut size, with combined with similar height modules. With different lncRNA modules marked with different colors, the gray modules represent lncRNA that cannot be combined; then, we applied the Pearson correlation analysis to evaluate the correlation of lncRNA and EMT values in each module. Finally, the lncRNAs in the model with and value < 0.05 were defined as EMTlnc.

2.3. Establishment and Verification of EMT-LPS

The univariate Cox was used to choose prognostic EMTlnc in both TCGA cohort and ICGC cohort based on EMTlnc. Through taking intersection, 33 lncRNAs with were defined as shared prognostic EMTlnc. Then, we used the LASSO Cox regression to select the significant prognostic lncRNAs and construct EMT-LPS involved 11 EMTlnc by the “glmnet” package [19] in R. Here, to facilitate parameter selection, the“10-fold cross-validation” approach was applied. The risk score of TCGA cohort and ICGC cohort patients was calculated as the following formula:

where is the estimated regression coefficient obtained from LASSO Cox regression analysis and is the expression value of each selected EMTlnc. PAAD patients were classified into low-risk and high-risk groups based on the median risk score. The OS results were then compared using the log-rank test and Kaplan-Meier survival analysis. To verify the accuracy of the identified EMT-LPS, the receiver operating characteristic (ROC) curve analysis in the package “Survival ROC” was applied [20].

2.4. Stratification Analysis

The entire TCGA cohort was stratified by age (≥60 () or <60 ()), gender (female () or male ()), grade (G1+G2 (), G3+G4 ()), and TNM stages (T1+T2 (), T3+T4 (), N0 (), N1 ()), while the ICGC cohort was divided into different subgroups (age (≥60 years () or <60 years ()), gender (female () or male ()), and TNM stages (T1+T2 (), T3+T4 (), N0 (), N1 ())). We applied the Wilcoxon rank sum test to compare the risk score of different stratification cohorts with the package “ggpubr” (https://cran.r-project.org/web/packages/ggpubr). The risk score formula obtained in TCGA cohort was applied to calculate the risk score of each PAAD patient in each stratified cohort and then divide them into the low-risk group and high-risk group. By Kaplan-Meyer survival analysis, the differences of OS results between the low-risk group and high-risk group were compared by the logarithmic rank test.

2.5. Principal Component Analysis (PCA) and Nomogram Construction

Based on the expression of 200 EMT-related genes in the HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION pathway, we used PCA to evaluate the differences between low-risk and high-risk subgroups and constructed a nomogram using the R package “RMS” [21] to evaluate the 1-, 2-, and 3-year survival possibility of patients in the cohorts. We then generated the calibration curve of the nomogram and evaluated the consistency between the predicted value and the actual observed value through the rms package.

2.6. Construction of the ceRNA Network

On the basis of the threshold and , the differentially expressed genes (DEGs) between low-risk and high-risk subgroups were identified by the Wilcoxon rank sum test through TCGA cohort. 11 target miRNAs of EMTlnc in the miRcode database (http://www.mircode.org/) were predicted and analyzed by using the Perl programming language. Then, in miRTarBase (http://mirtarbase.mbc.nct.edu.tw/PHP/index.php), miRDB (http://mirdb.org/), and TargetScan databases (http://www.targetscan.org/), we found the shared target mRNA of these miRNAs, crossed with DEGs, and got differential expression. Finally, we drew the ceRNA network through the “Cytoscape” software [20].

2.7. Enrichment Analysis

We inputted differentially expressed genes between low-risk and high-risk subgroups and targeted mRNAs differentially expressed in the ceRNA network into the “Metascape” website (http://metascape.org/) for function and pathway enrichment analysis, including typical pathway, Reactome gene set, Gene Otology (GO) biological process, and Kyoto Gene and Genome Encyclopedia pathway (KEGG pathway). In addition, in order to study which tumor features are more common in high-risk subgroups, GSEA software (http://software.broadinstitute.org/gsea/index.jsp) was used.

2.8. Statistical Analysis

In this study, the statistical analysis of all datasets is carried out with the R programming language (4.0.0). Kaplan-Meier, log rank test, and univariate Cox regression were applied for survival analysis based on the expression of EMT-related lncRNAs contained in EMT-LPS. Univariate and multivariate Cox regression analyses were used to evaluate the independent predictive value of EMT-LPS for OS prognosis.

3. Results

3.1. Identification of EMTlnc in PAAD Patients

First, 14805 lncRNAs from TCGA dataset and 1440 lncRNAs from the ICGC dataset were identified for the next analysis. Then, we analyzed the genes related to the EMT pathway in TCGA PAAD samples by GSVA and got the GSVA variation. After applying WGCNA on the database of PAAD patients in TCGA, the coexpression network by WGCNA analysis showed that the EMT-related lncRNA models in PAAD samples were grouped into 48 models, containing MElightcyan and MEturquoise models with and value < 0.05, which were both included into ulterior analysis in this research. The lncRNAs in those two models were defined as EMTlnc.

With the combination of the prognostic information, we implement univariate Cox regression () in two datasets, screening out the prognostic reactions associated with EMT from the EMTlnc. Ultimately, in these two datasets, by cross-analysis, we found that EMTlnc was significantly correlated with the OS of PAAD patients. The workflow is shown in Figure 1(a), and WGCNA in the PAAD sample is shown in Figure 1(b). The Pearson correlation analysis showed the correlation between the membership in the light cyan module and membership of characteristic genes in the light cyan module in the EMT pathway and the Pearson correlation analysis showed the correlation between the membership in the turquoise module and characteristic genes in the turquoise module in the EMT pathway as shown in Figures 1(c) and 1(d), respectively. The univariate Cox analysis results of 33 EMTlnc are shown in Table 1.

3.2. Construction of the EMT-LPS in TCGA Dataset and Validation of the EMT-LPS in the ICGC Dataset

In TCGA cohort, on the basis of 33 EMT-related prognostic lncRNAs, we performed a LASSO Cox analysis to build the EMT-LPS for the prediction of PAAD patients’ OS, and coefficients containing 11 EMT-LPS were generated (Figures 2(a) and 2(b)). The EMT-LPS included 11 lncRNAs, and the risk assessment is calculated from the coefficient of each lncRNA (Figure 2(c)). Then, according to the median risk score, the patients were divided into low-risk and high-risk subgroups. The Kaplan-Meyer survival curve showed that PAAD patients with poor clinical outcomes usually have higher risk scores, which means lower OS incidence and shorter OS time (Figure 2(d)). Figure 2(f) plots the risk score and survival status. The ROC curve (Figure 2(h)) showed that EMT-LPS had a strong ability to predict OS in TCGA cohort (, 0.86, and 0.9 for 1, 2, and 3 years, respectively).

To verify the predictive ability of EMT-LPS, we calculated the same formula for these patients in the ICGC cohort. Based on the average risk score, we divided PAAD patients into low-risk and high-risk groups. The results were consistent with TCGA dataset: in the ICGC dataset, patients with higher risk scores had lower OS rates and shorter OS time (Figure 2(e)). Figure 2(g) shows the assessment of risk and survival status distribution, which indicated that the total survival time and death status were shorter in patients with the higher risk score. ROC analysis also showed that EMT-LPS has a powerful predictive value for PAAD patients (, 0.85, and 0.89 for 1, 2, and 3 years, respectively; Figure 2(i)). These results comprehensively implied that the EMT-LPS had a strong and stable OS prediction ability.

3.3. Prognostic Analysis of the Eleven EMTlnc

There were eleven EMTlnc in the EMT-LPS, and to evaluate their prognostic roles, univariate Cox regression analysis was performed. In the forest plot, we can see that PP7080, xxbac-B135H6.15, RP5-1085F17.3, DANCR, AC096772.6, and LINC01128 are protective factors (HR: ), while LINC01116, UCA1, and RP11-400N13.3 are risk factors in EMT patients (, Figure 3(a)). Heatmap (Figure 3(b)) indicates that with the increase in the risk score, RP11-55418.2, RP11-400N13.3, UCA1, LINC00152, and LINC01116 expression levels were also increased; however, the expression of the AC096772.6, xxbac-B135H6.15, LINC01128, RP5-1085F17.3, PP7080, and DANCR was decreased. Their expression levels were also connected with PAAD’s clinicopathological features, including N_stage, T_stage, gender, age, and WHO grade (Figure 3(b)). Kaplan-Meier survival curves verified the high expression of AC096772.6, xxbac-B135H6.15, LINC01128, RP5-1085F17.3, and PP7080, and the low expression of RP11-55418.2, RP11-400N13.3, UCA1, LINC00152, DANCR, and LINC01116 was related to better OS in TCGA dataset (Figures 3(c)3(m)). In the ICGC dataset, the heatmap (Supplementary Figure S2C) also showed that RP11-55418.2, RP11-400N13.3, UCA1, LINC00152, and LINC01116 expression increased with the increase in the risk score, while the expression of the AC096772.6, xxbac-B135H6.15, LINC01128, RP5-1085F17.3, PP7080, and DANCR decreased with the increasing risk score. Their expression levels were connected with PAAD’s clinicopathological features, such as N_stage, T_stage, gender, and age.

3.4. Stratification Analysis of the EMT-LPS

We tried to determine if the clinicopathological features were related to the risk score or not. In TCGA dataset, the results showed the PAAD patients whose WHO grade at III+IV were in a higher risk scores, but the risk score had no association with age, gender, N_stage, and T_stage (Figures 4(a)–4(e)). To have a better assessment on the prognostic ability of EMT-LPS, we conducted a stratification analysis to verify if it retains the ability to predict OS in different subgroups. On the contrary to the PAAD patients with lower risk, patients with higher risk had worse OS when they were ≥60 years old (Figures 4(f) and 4(g)). In the same way, we proved that EMT-LPS can predict the OS of female or male PAAD patients (Figures 4(h) and 4(i)), patients with grade I+II or grade III+IV, and patients with TNM stage N0, N1, or T3+T4 (Figures 4(j)–4(n)). In the ICGC dataset, the results showed that PAAD patients with N1_stage, T3_stage, and T4_stage had higher risk scores (Supplementary Figure S2D–G). These data testified that the EMT-LPS might be a potential predictor of PAAD patients.

3.5. Principal Component Analysis

We conducted principal component analysis (PCA) on the basis of the expression value of the 200 EMT-related genes to evaluate the differences between the low-risk and high-risk groups (Supplementary Figure S5A, B). The results showed that the low-risk and high-risk patients were distributed in distinct directions in both TCGA and ICGC datasets. These results may suggest that different risk subgroups have different EMT statuses.

3.6. Pathway and Process Enrichment Analysis and Gene Set Enrichment Analysis (GSEA)

We identified 710 differentially expressed genes (DEGs) ( and ) between two subgroups in TCGA cohort to study the potential biological process and pathway concerning molecular heterogeneity between the low-risk and high-risk subgroups. These DEGs were mainly focused on the following aspects: chemical synaptic transmission, neuronal system, membrane potential regulation, behavior, projection morphogenesis of plasma membrane-binding cells, and GABAergic synapse (Supplementary Material S1A). GSEA showed that the interferon alpha response and the interferon gamma response (two tumor hallmarks) were enriched in the high-risk subgroup (Supplementary Material S1B, C). These results may be helpful to elucidate the cell biological effects of EMT-LPS.

3.7. EMT-LPS Was an Independent Prognostic Factor in PAAD Patients

We used univariate and multivariate Cox analyses to evaluate whether EMT-LPS was an independent prognostic factor in PAAD patients. The former showed that EMT-LPS was significantly correlated with OS based on TCGA dataset (hazard ratio (HR): 3.832, 95% CI: 2.435-6.030, ; Supplementary Material S2A). The latter showed that EMT-LPS was an independent predictor of OS (HR: 3.573, 95% CI: 2.248-5.681, ; Supplementary Material S2A). This conclusion was verified in the ICGC dataset. It was confirmed that EMT-LPS was an independent predictor of OS of PAAD patients in the ICGC dataset (univariate: HR: 2.667, 95% CI: 1.923-3.697, ; multivariate: HR: 2.787, 95% CI: 1.945-3.94, ; Supplementary Material S2B). The results above proved that EMT-LPS was an independent prognostic indicator and of great significance for clinical prognosis evaluation.

3.8. Construction and Validation of the Nomogram Based on EMT-LPS

By using risk status (based on EMT-LPS), gender, age, T_stage, N_stage, and WHO grade in TCGA dataset, we constructed a nomogram to create a clinically applicable quantitative tool to predict the OS of PAAD patients and tested in the ICGC dataset (Supplementary Material S2C). Calibration plots indicated that 1-, 2-, and 3-year OS ratios are completely consistent with the predicted ratios in these two cohorts (Supplementary Material S3A-C and Supplementary Material S4B-D, respectively). Then, we used the time-dependent ROC curves to evaluate the predictive ability of the nomogram and other predictors in TCGA (risk score, gender, age, T_stage, N_stage, and WHO grade, Supplementary Material S3D-F and Supplementary Figure S1E-G, respectively); the results showed that the nomogram had excellent accuracy in 1-, 2-, and 3-year OS (AUC of TCGA was 0.79, 0.83, and 0.86; AUC of ICGC was 0.8, 0.84, and 0.89). These data suggested that the nomogram has strong and stable capabilities to predict the OS for PAAD patients.

3.9. Construction of the ceRNA Network and Functional Enrichment Analysis

Based on the EMTlnc, we established a ceRNA network to have a deeper insight on how EMTlnc regulates mRNA expression by sponging miRNAs in PAAD. Two of 12 lncRNAs were extracted from the miRcode database, and 13 pairs of interactions between 2 lncRNAs and 34 miRNAs were identified. Then, we used three databases (miRTarBase, miRDB, and TargetScan) to search for target mRNAs on the basis of the 34 miRNAs, and 1539 mRNAs were identified in these three databases.

Furthermore, these target mRNAs were crossed with DEGs to obtain differentially expressed target mRNA. Finally, two lncRNAs, twelve miRNAs, and thirteen mRNAs were included in the ceRNA network (Figure 5(a)). Additionally, functional enrichment in the network tool using 1539 mRNA targets revealed that these genes are enriched in vascular system development, pathway in cancer, regulation of cellular response to stress, Wnt signaling pathway, tissue morphogenesis, insulin signaling, regulation of cellular protein localization, response to growth factor, and negative regulation of cell differentiation (Figures 5(b)5(d)). These data can provide some clues for us to discover the potential functionality of EMTlnc in PAADs.

4. Discussion

In this study, 268 PAAD patients in TCGA and ICGC datasets were included to explore the prognostic significance of EMTlnc. In these two datasets, 33 EMTlnc were proven to have predictive value and 11 EMTlnc among them were used to establish EMT-LPS to predict PAAD OS. On the basis of the median risk score, we divided PAAD patients into low-risk and high-risk subgroups and found that the latter group had worse clinical outcomes, richer tumor characteristics, and exact malignant-related pathways. Multivariate Cox regression analysis revealed the EMT-LPS was an independent risk factor for OS.

In addition, we set up a nomogram of EMT-LPS combined with gender, age, and World Health Organization grade, T_stage and N_stage, which has a strong predictive ability for OS of PAAD patients on TCGA and ICGC datasets. Finally, we established a ceRNA network that includes two EMTlnc, twelve miRNAs, and thirteen mRNAs for observing the latent functions of these EMTlnc.

More and more evidences show that lncRNAs can coordinate multiple cell processes by regulating EMT in different types of cells. MALAT1 and lnc-ATB can stimulate the EMT process by competitively binding miR-503 and miR-200c and then promote silica-induced pulmonary fibrosis [22, 23]. In EMT of the breast, bladder, and nasopharynx, lncRNA ROR can regulate various signal pathways [2426]. More importantly, by inducing EMT to accelerate the growth and progression of cancer, hypoxia can enhance the mutual movement of lncRNA UCA1 into the exon-mediated bladder cancer cells [27].

Studies had revealed that EMT had impact on cancer invasion and progression, and lncRNAs may play the role of ceRNAs, targeting EMT regulators so as to influence the aggressive progression of the tumor. Liu et al. [28] found that TGFBI acts as the ceRNA of miR-21 and FN1 acts as miR-200c to regulate EMT. What is more, the richness of ceRNA can determine the reversibility of EMT. By the way, it was reported that the mutations in BRCA2 significantly increase the risk of pancreatic cancer [29]; whether it is associated with EMT is a question that needs to be further explored. miR-330 might be of potential therapeutic value for pancreatic cancer pain because it participated in the genesis of pancreatic carcinoma-induced pain hypersensitivity [30]; perhaps we can explore the mechanism of pancreatic cancer pain from the perspective of EMT. Here, we implemented a ceRNA-based functional enrichment analysis and found that genes were mainly enriched in vascular system development, pathway in cancer, cell stress response regulation, Wnt signaling pathway, and some other pathways. Taking all these evidences into consideration, we bet that EMT is targeted at lncRNAs, and to identify potential prognostic markers or therapeutic targets, we should better place more emphasis on the interactions and functions of lncRNAs and EMT.

From 268 PAAD patients, we identified 33 EMT-related prognostic lncRNAs, eleven of which were included in the EMT-LPS. Yang et al. uncovered that RP11-400N13.3 reacts as an oncogenic lncRNA in colorectal cancer, and by modulating the miR-4722-3p/P2RY8 axis, it can accelerate colorectal cancer progression [31]. LINC00152 was firstly found overexpressed in gastric cancer and acted as an oncogene in gliomas and liver, lung, and colorectal cancer [3234]. It can promote cell proliferation and invasion capability of these cancer cells by targeting GFR, EZH2, miR-16, and miR-139-5p [3539]. Wang et al. confirmed that the overexpression of LINC01116 was related to the clinicopathological characteristics and survival in glioma patients and can accelerate tumor proliferation and neutrophil recruitment by regulating IL-1β [40]. Li et al. found that LINC01128 can resist acute myeloid leukemia by regulating miR-4260/NR3C2 [41]. Tang et al. reported that miR-135a can downregulate DANCR by regulating the downstream of NLRP3 in pancreatic cancer [42], which is consistent with our results.

Some of the 11 lncRNAs have been reported to be related to cancer progression, but little is reported about PAAD, let alone how lncRNAs interact with EMT-related genes. Hence, we hope that our results will help determine the predictive responses that EMT regulators may target, thus providing insight into its potential role in carcinogenesis and PAAD development.

In a word, this study included two PAAD datasets (TCGA and ICGC datasets) through which our results have been obtained and validated, but there also are some limitations. In the future, we should use more independent PAAD cohorts to verify the established prognostic EMTlnc, design new drugs or strategies to manage PAAD, and advocate more research to reveal the specific mechanism and genes of PAAD regulating EMT.

5. Conclusions

EMT-LPS played an important role in PAAD’s carcinoma progression and may help us in choosing the way of prognosis assessment and provide some clues to design the new drugs for PAAD.

Data Availability

The datasets during the current study are available from the corresponding author upon reasonable request.

Disclosure

This study was posted as a preprint in “research square” according to the following link: https://www.researchsquare.com/article/rs-613873/v1 [43].

Conflicts of Interest

The authors declare that there are no conflicts of interests.

Authors’ Contributions

Hongwei Zhu and Jixing Liu were responsible for the conceptualization. Shuai Wang and Hai Hu were responsible for the methodology. Shuai Wang and Hai Hu were responsible for the software. Yanyao Deng, Hai Hu, and Le Xiao were responsible for the validation. Shuai Wang, Hai Hu, and Ting Cai were responsible for the formal analysis. Jixing Liu was responsible for the resources. Ting Cai and Wenzhe Gao were responsible for the data curation. Yanyao Deng and Hai Hu prepared the original draft. Le Xiao and Ting Cai reviewed and edited the paper. Shuai Wang was responsible for the visualization. Wenzhe Gao was responsible for the supervision. Hongwei Zhu and Jixing Liu were responsible for the project administration. Yanyao Deng and Hongwei Zhu were responsible for the funding acquisition. All authors have read and agreed to the published version of the manuscript. Yanyao Deng and Hai Hu have contributed equally to this work.

Acknowledgments

This research was funded by Hunan Provincial Science and Technology Plan Project (grant number 2019JJ80066), Scientific Research Project of Health and Family Planning Commission of Hunan Province of China (grant number B20-17202), National Natural Science Foundation for Young Scholars of China (grant number 82000614), Natural Science Foundation of Hunan Province, China (grant number 2020JJ5876), and Science and Technology Project of Changsha, Hunan, China (grant number kq2004146).

Supplementary Materials

Supplementary Material S1: functional analysis of 710 differentially expressed mRNA (DEms) between low-risk and high-risk subgroups. Supplementary Material S2: EMT-LPS was an independent prognostic factor for PAAD patients. Supplementary Material S3: construction of EMT-LPS calibration in TCGA dataset. Supplementary Material S4: nomogram and calibration based on EMT-LPS in the ICGC dataset. Supplementary Material S5: (A, B) Principal component analysis of EMT-LPS in TCGA and ICGC datasets. (C) Heatmap of the relationship between the expression levels of eleven EMT-related lncRNAs and clinicopathological features in the ICGC dataset. (D-G) Stratification analysis of EMT-LPS in ICGC patients with different clinical characteristics. (Supplementary Materials)