Abstract
China’s cultural classics have high artistic and ideological value and embody China’s historical heritage and the cultural inheritance of the Chinese nation over thousands of years. With the rapid development of China’s economy, and especially in the context of cultural globalization, China’s traditional cultural classics have attracted worldwide attention along with its historical and cultural treasures. Other countries need translations of these classics, whether out of appreciation for them or for academic research. However, China has a large number of cultural classics: according to relevant data, there are approximately 35,000 kinds, of which only 0.2% have been translated into foreign languages. In order to make China’s excellent cultural classics known to the world and let more people in other countries understand Chinese culture, this paper studies the translation of cultural classics through deep learning. Firstly, this paper introduces the basic concept of deep learning and presents an algorithm built on the convolution layer and the pooling layer. Secondly, a deep learning model is established that calculates and counts the text information of ancient books through explicit intertextuality. Thirdly, the model carries out automatic text translation and lists the analysis process of cultural classics translation based on intertextuality. The obtained results can promote the development of cultural classics translation in China. For the translation of complex cultural classics, the effects of three models are tested experimentally, of which the sequence model (seq) performs best. This model extracts the text of literary classics quickly and simply.
1. Introduction
The rapid development of cultural classics translation has attracted the attention of experts and scholars in academia, industry, and research organizations, who study the translation theory of cultural classics from various perspectives, including “culture,” “language,” and “communication” [1]. The translation of cultural classics involves many kinds of content. Since a text in Chinese language and culture is composed of texts, paragraphs, sentences, and words, it is necessary to count, retrieve, and extract the contents of cultural classics before translation. Furthermore, translated texts are collected in established monolingual and parallel bilingual corpora [2]. At the same time, cultural classics contain complex classical Chinese poetry and other content, and the amount of data to be analyzed is huge. Therefore, state-of-the-art computational techniques such as machine learning, artificial intelligence, and deep learning can be used for such data analysis and manipulation.
This paper uses a deep learning method to analyze the text data in cultural classics and suggests a learning algorithm that calculates the explicit intertextual data in cultural classics. The proposed algorithm compares the text data and counts the number of words shared by two sentences. Subsequently, we use a text translation index to realize automatic text translation. However, the translation of China’s classics is complex, so computer translation can only assist the translator by further analyzing the intertextuality between each sentence of the original text and the historical texts, so as to judge the meaning of cultural classics in history and culture.
In this paper, we analyze the data of cultural classics based on the deep learning algorithm, compare the historical intertextuality of each sentence in cultural classics by using a text translation index, and establish an intertextuality matrix to analyze the content of cultural classics. Moreover, we analyze the sentences of cultural classics according to the intertextuality text translation index process and improve the spatial vector model based on the sequence model. Using this model, we can measure the intertextuality of the text in cultural classics, analyze the similarity between sentences, and improve the accuracy of cultural classics translation. The main innovations of this study are as follows: (i) the data of cultural classics are analyzed based on a deep learning algorithm, the historical intertextuality of each sentence is compared by using a text translation index, and an intertextuality matrix is established to analyze the content of cultural classics; (ii) the sentences of cultural classics are analyzed according to the intertextuality text translation index process, and the spatial vector model is improved based on the sequence model; (iii) for the translation of complex cultural classics, the effects of three models are tested experimentally, of which the sequence model (seq) is the best.
The remaining part of the paper is structured as follows: Section 2 discusses recent state-of-the-art methods. Section 3 presents a deep learning-based algorithm, discusses convolutional neural networks, and proposes a model for text translation. Section 4 discusses the translation of cultural classics based on the suggested deep learning algorithm. The obtained results are elaborated in Section 5. Finally, Section 6 concludes the discussion and gives directions for future research.
2. Related Work
China began to translate its cultural classics as early as 647 AD, when the Tao Te Ching was translated into Sanskrit, opening the door for Chinese cultural classics to travel beyond China [3]. Chinese cultural classics were first translated into English and spread to Western countries at the end of the 16th century. In 1588, the “Mingxin Baojian” was translated into Spanish in Manila, the Philippines, which opened the prelude to the Western translation of Chinese cultural classics [4]. Li and Jiang provided a platform for Westerners to better understand China in their Chinese series, which covers history, language and literature, religion, politics, and geography and has become a link between Chinese and Western cultures [5]. Ricci and Luo Mingjian et al. both translated the “Four Books.” To this day, Luo Mingjian’s translations are kept in national libraries such as the one in Rome, Italy [6].
In this period, all the Latin translations were hand-copied and only a few were printed. As the earliest Western translations of China’s cultural classics, they had little impact on the wider world. Thoreau, a famous American writer of the 19th century, read Chinese Confucian classics and absorbed the Confucian thought of “peace and contentment.” In the process of constructing American culture, Thoreau firmly believed in this thought and began to question the materialism popular in the United States. Under the influence of Thoreau, the United States began to accept Chinese ecological literature and cultural classics [7]. Sun studied the influence of medieval classics on the development of Japan and how China’s ancient classics spread to Japan through multiple channels and in different ways. During the prosperous Tang Dynasty, Chinese classics were widely distributed across Japan, directly affecting Japanese culture and education, politics, economy, literature, and art [8].
From the perspective of medical administration, Li and Teng studied the contents of medical administration in the Taiping imperial survey of the Song Dynasty, which fully reflects the contribution of ancient medical literature and history books to human medical development [9]. Pang and Hou introduced the Chinese traditional cultural classics to the whole world and let them feel the charm of Chinese cultural classics. In addition, they also put forward the concept of ecological translation and established the translation ecology of Chinese traditional cultural classics [10]. In the era of “Chinese literature going global,” Zhao, starting from the theory of translation multisubjectivity, combined with the view of Chinese context cultural poetics, chose Mencius as an example to fully reflect the multibody of different translators in China [11]. Starting from the development of guided reading education, Liu et al. integrated the Chinese cultural classics and used them in setting courses, recommending books, teaching materials, etc. In essence, this plays a guiding role and integrates science and technology, national history, and traditional culture as the development background of guided reading of Chinese cultural classics in the new era [12–14].
3. Deep Learning-Based Algorithm
3.1. Concept of the Convolutional Neural Network (CNN)
When exploring deep learning algorithms, researchers have analyzed them in depth on the basis of the convolutional neural network (CNN) [15]. A CNN is a feedforward neural network with a deep structure and convolution operations. It has strong representation and fitting ability and, owing to its hierarchical structure, can classify input information in a translation-invariant way [16]. Through continuous research, neural networks for different scenarios have appeared, such as the recurrent neural network (RNN) and the CNN. The two layers of the deep learning framework considered here are (i) the convolution layer and (ii) the pooling layer.
3.1.1. The Convolution Layer
A basic component of the convolutional neural network is the convolution layer, whose function is feature extraction. The method is similar to convolution in signal processing: a small convolution kernel slides over the image, is applied to the pixels in the corresponding image area, and outputs the result through linear superposition [17]. The convolution is calculated by the following formula:
$$S(i, j) = (I * K)(i, j) = \sum_{m} \sum_{n} I(m, n)\, K(i - m,\, j - n) \tag{1}$$
The physical meaning of the convolution is the result of multiple inputs in a superposition system acting together at a certain time. In formula (1), $I(m, n)$ is an original pixel of the image, and all pixels together form the complete image. Similarly, $K(i - m, j - n)$ represents the action point, which is widely used as the definition of the convolution kernel. The linear superposition of the two is the convolution result [18]. In Figure 1, $I$ is the input content and $K$ is the convolution kernel. The two interact to form the final result in the form of $I * K$.

There are many convolution layers in the convolution neural network, and there are a large number of convolution units in the convolution layer. The purpose of convolution operation on each layer of the convolution neural network is to extract various types of features. The convolution in the front row is used to extract low-level features from the image, such as lines and edges, and deep convolution is used to extract more complex features from the image [19].
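The sliding-window convolution described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used in this paper: the function name `conv2d`, the 4 × 4 input, and the 2 × 2 all-ones kernel are hypothetical, and the output is restricted to positions where the kernel fits entirely inside the image.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution: flip the kernel, slide it, sum the products."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
    kh, kw = flipped.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # linear superposition of the pixels under the kernel
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

image = np.arange(16).reshape(4, 4)   # hypothetical 4 x 4 input
kernel = np.ones((2, 2))              # hypothetical 2 x 2 kernel
result = conv2d(image, kernel)
print(result)
```

Most deep learning libraries skip the kernel flip and compute cross-correlation instead; for a learned kernel the two differ only in how the weights are indexed.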
Similarly, the main function of the convolution kernel is to extract image features, and different image blocks are obtained under the calculation of the convolution kernel. This paper takes the extraction of image edge features as an example [20]. First, the first-order and second-order differentials are described. For a one-dimensional function $f(x)$, the first-order differential is defined by the following formula:
$$\frac{\partial f}{\partial x} = f(x + 1) - f(x)$$
The second-order differential is defined by the following formula:
$$\frac{\partial^2 f}{\partial x^2} = f(x + 1) + f(x - 1) - 2 f(x)$$
A rotation-invariant filter is used in the convolutional neural network. With such a filter, the filtering result is unchanged when the image is rotated arbitrarily. The second-order differential operator is more sensitive to image edges, so the rotation-invariant second-order differential operator is used here for explanation [21, 22]. For a two-dimensional image $f(x, y)$, the derivation of the Laplace operator can be described by the differential operator. First, the second-order differentials of $f$ in the $x$ and $y$ directions are calculated by the following formulas:
$$\frac{\partial^2 f}{\partial x^2} = f(x + 1, y) + f(x - 1, y) - 2 f(x, y), \qquad \frac{\partial^2 f}{\partial y^2} = f(x, y + 1) + f(x, y - 1) - 2 f(x, y)$$
The Laplace operator is obtained by summing the second-order differential in $x$ and the second-order differential in $y$, and it has the same form as a convolution kernel:
$$\nabla^2 f = f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4 f(x, y)$$
The Laplacian is a differential operator that responds to sudden changes in image gray level. The sharpening result can be obtained by superimposing the original image and the output of the operator.
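As a hedged illustration of the Laplace operator and the sharpening step, the sketch below applies the standard 4-neighbour Laplacian to the interior pixels of an image and then subtracts the response from the original. The function names and the small test images are hypothetical; the sign convention (subtracting because the centre coefficient is negative) is the usual textbook one and is assumed here rather than taken from this paper.

```python
import numpy as np

def laplacian(image):
    """4-neighbour Laplacian on interior pixels; border pixels are left at 0."""
    f = np.asarray(image, dtype=float)
    out = np.zeros_like(f)
    out[1:-1, 1:-1] = (f[2:, 1:-1] + f[:-2, 1:-1] +
                       f[1:-1, 2:] + f[1:-1, :-2] - 4.0 * f[1:-1, 1:-1])
    return out

def sharpen(image):
    """Superimpose the image and the operator output (subtract: centre weight is -4)."""
    f = np.asarray(image, dtype=float)
    return f - laplacian(f)

flat = np.full((5, 5), 7.0)                                  # constant gray level
step = np.tile([0, 0, 10, 10, 10], (5, 1)).astype(float)     # vertical edge

print(laplacian(flat))   # no response where the gray level does not change
print(laplacian(step))   # response concentrated on both sides of the edge
print(sharpen(step))     # edge contrast is exaggerated
```

The flat image produces no response, while the step image produces a positive and a negative response on either side of the edge, which is why adding the operator output back to the image increases edge contrast.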
3.1.2. The Pooling Layer
The convolution layer of the convolutional neural network outputs a feature map, and the pooling layer follows the convolution layer. In essence, the pooling layer performs feature selection. The two common types of pooling layer are (i) the maximum pooling layer and (ii) the average pooling layer. The following analysis describes the maximum pooling process [23].
Let $A$ be the input matrix with elements $a_{ij}$, let the size of the pooling kernel be 2 × 2, and let the step size of the pooling layer be 1. The pooling operation is then completed by the following formula:
$$p_{ij} = \max\left(a_{i,j},\ a_{i,j+1},\ a_{i+1,j},\ a_{i+1,j+1}\right)$$
Here, $p_{ij}$ is the pooled output at position $(i, j)$, and the calculation process of the pooling layer can be shown in the form of an image.
In Figure 2, the step size of the maximum pooling layer is 2 and the filter is 2 × 2. The result obtained by taking the maximum over the area in the upper left corner of the diagram is 6, that is, the maximum of the four numbers in that area. By the same principle, the results for the other areas are 3, 4, and 8, and together these values constitute the output features of the pooling layer.
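The maximum pooling process can be sketched as follows. The 4 × 4 feature map is a hypothetical example chosen so that the four pooled values are 6, 8, 3, and 4; it is not claimed to be the matrix drawn in Figure 2. The function name and its `size` and `stride` parameters are likewise illustrative.

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Slide a size x size window with the given stride and keep each maximum."""
    x = np.asarray(x)
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = x[r:r + size, c:c + size].max()
    return out

feature_map = np.array([[1, 1, 2, 4],
                        [5, 6, 7, 8],
                        [3, 2, 1, 0],
                        [1, 2, 3, 4]])   # hypothetical input
pooled = max_pool2d(feature_map)          # 2 x 2 window, step size 2
print(pooled)
```

With a step size of 2 the windows do not overlap and the 4 × 4 map shrinks to 2 × 2; calling `max_pool2d(feature_map, stride=1)` gives overlapping windows and a 3 × 3 output.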

The function of the pooling layer is to remove redundant data and retain only key data, which helps to deal with the large number of parameters in the convolutional neural network, since more parameters increase the computational complexity. In addition, the pooling layer can enhance invariance to spatial translation, deformation, and scale. In essence, the pooling layer is an information filtering process in which some data are discarded in order to improve computing performance.
3.2. Deep Learning-Based Model
When designing a traditional CNN model, it is necessary to increase the number of channels or layers to improve performance. However, increasing the number of channels and layers leads to more parameters that must be learned during training. The increase in the number of network parameters leads to overfitting of the network as well as longer training time [24]. Therefore, simply increasing the number of layers or the number of units per layer is not ideal.
In this paper, a sparse hierarchical structure is used to expand the range of learned features. Moreover, convolutions of different sizes are used within a single layer, which can reduce the amount of calculation and better express the sparse structure [25]. Compared with a single receptive field, multiscale convolution can learn a variety of features as the receptive field changes. As shown in Figure 3, the Inception V1 structure is used here, with convolution kernels of size 3 × 3, 1 × 1, and 5 × 5 generating the multiscale convolution layer. In this figure, constrained convolution is applied after the 3 × 3, 1 × 1, and 5 × 5 convolutions. Next, the results are fed into the convolution layer, and the feature maps of each layer are integrated to complete the maximum pooling operation and the 7 × 7 convolution operation [26].
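The idea of a multiscale convolution layer can be illustrated with a minimal NumPy sketch in which 1 × 1, 3 × 3, and 5 × 5 branches process the same input and their feature maps are stacked. This is not the Inception V1 network of Figure 3: the random kernels, the zero padding, the single input channel, and the function names are all assumptions made for illustration, and the constrained convolution, pooling, and 7 × 7 stages are omitted.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Cross-correlate with zero padding so the output keeps the input size."""
    k = kernel.shape[0]                      # square, odd-sized kernel assumed
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out

def multiscale_block(image, kernels):
    """Run every branch on the same input and stack the maps along a channel axis."""
    return np.stack([conv2d_same(image, k) for k in kernels], axis=0)

rng = np.random.default_rng(0)
image = rng.random((8, 8))                                     # hypothetical input
kernels = [rng.standard_normal((s, s)) for s in (1, 3, 5)]     # three scales
features = multiscale_block(image, kernels)
print(features.shape)   # one feature map per kernel size
```

Because every branch pads its input to preserve the spatial size, the branch outputs can be stacked directly, which is what allows receptive fields of different sizes to be combined in one layer.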

4. Translation of Cultural Classics Based on Deep Learning
4.1. Explicit Intertextuality Statistics
This paper studies the translation of cultural classics based on deep learning, compares text information to find explicit intertextuality, and quantifies it with a fixed measure. The higher level is implicit intertextuality, which requires understanding the meaning of the text from its context. It should be noted that intertextuality calculation can quantify the explicit intertextuality in cultural classics and, therefore, may provide relevant clues for understanding the implicit intertextuality in cultural classics.
When measuring explicit intertextuality, the number of words shared by two sentences should be calculated, and the measurement should be applied to pairs of sentences taken from two cultural classics. In the experiment, this paper selects the classics “Xu Cha Jing” and “Cha Jing” as the objects, adopts a variety of vector similarity measures as text intertextuality measures, and lists their definitions in Table 1.
If a sentence is regarded as a vector composed of multiple words or phrases, the statistical value between two vectors is the intertextuality measurement between the two sentences. Here, X and Y represent vectors, |X| and |Y| represent the number of elements contained in each vector, and |XY| represents the number of dimensions that X and Y have in common. This paper also calculates the longest common subsequence of two sentences. The process of calculating the longest common subsequence is as follows [14].
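A rough sketch of how such measures can be computed from |X|, |Y|, and |XY| is given below. Table 1 is not reproduced in this text, so the four measures shown (Dice, Jaccard, overlap, and cosine) are assumptions about typical vector similarity measures rather than the paper's exact list. Sentences are treated as sets of units, so repeated units are counted once, which is a further simplification.

```python
import math

def intertextuality_measures(x, y):
    """Set-based similarity measures built from |X|, |Y| and the shared count |XY|."""
    sx, sy = set(x), set(y)
    if not sx or not sy:
        return {"dice": 0.0, "jaccard": 0.0, "overlap": 0.0, "cosine": 0.0}
    common = len(sx & sy)                      # |XY|: units shared by both sentences
    return {
        "dice": 2 * common / (len(sx) + len(sy)),
        "jaccard": common / len(sx | sy),
        "overlap": common / min(len(sx), len(sy)),
        "cosine": common / math.sqrt(len(sx) * len(sy)),
    }

# Hypothetical sentences, each given as a list of units (words or characters).
scores = intertextuality_measures(list("abcd"), list("bcde"))
print(scores)
```

All four measures equal 1 for identical sentences and 0 for sentences with nothing in common; they differ in how strongly they penalize a difference in sentence length.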
Suppose that the intertextual matrix of the sentence $X = \{x_1, x_2, \ldots, x_m\}$ and the sentence $Y = \{y_1, y_2, \ldots, y_n\}$ is as follows:
$$D = (d_{ij})_{m \times n}, \qquad d_{ij} = \mathrm{sim}(x_i, y_j)$$
Based on the intertextual matrix $D$ of the two sentences, the maximum matching subsequence between them is obtained by using a dynamic programming algorithm. The purpose of calculating the maximum matching subsequence is to find the ideal pairing between the words of the two sentences, that is, to select as many elements of the matrix as possible such that no two of them share a row or a column. The maximum matching subsequence is calculated by the following recurrence, with $F(i, 0) = F(0, j) = 0$:
$$F(i, j) = \max\left\{F(i - 1, j - 1) + \mathrm{sim}(x_i, y_j),\ F(i - 1, j),\ F(i, j - 1)\right\}$$
Here, $F(i, j)$ represents the objective function of the dynamic programming and $\mathrm{sim}(x_i, y_j)$ represents the similarity function.
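The dynamic programming step can be sketched as below. This is the standard order-preserving matching recurrence, assumed here to correspond to the maximum matching subsequence described above; the function name and the default 0/1 similarity function are hypothetical. With the 0/1 similarity the result is exactly the length of the longest common subsequence.

```python
def max_matching(x, y, sim=lambda a, b: 1.0 if a == b else 0.0):
    """Best total similarity of an order-preserving matching between x and y."""
    m, n = len(x), len(y)
    # f[i][j] is the best score using the first i units of x and the first j of y
    f = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            f[i][j] = max(
                f[i - 1][j - 1] + sim(x[i - 1], y[j - 1]),  # pair x_i with y_j
                f[i - 1][j],                                 # leave x_i unmatched
                f[i][j - 1],                                 # leave y_j unmatched
            )
    return f[m][n]

print(max_matching("ABCBDAB", "BDCABA"))   # length of the longest common subsequence
```

Passing a graded `sim` function (for example a lexical similarity in the range 0 to 1) turns the same recurrence into a soft matching in which near-synonyms also contribute to the score.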
4.2. Text Translation Index
Automatic text translation is very important for the translation of cultural classics, because the interpretation of classics is difficult. Here, we can make full use of computers to assist the translation of cultural classics [27]. The text translation index is used to automatically select highly relevant reference sentences for an original sentence according to a fixed measurement standard. Intertextuality is used to comprehensively explain the correlation and influence between texts and provides a coherent and accurate data reference for translation. Texts guide and refer to one another and thereby form intertextuality. In addition, there are many sentences within the same text that refer to each other or share a common origin, so the content of text intertextuality mapping is mainly divided into two parts: (i) intertextuality between texts and (ii) intertextuality within the text. The two intertextual measurement types provide an important basis for the understanding and translation of the original text and are detailed as follows.
(i) Intertextuality between texts: the intertextuality of two different texts needs to be calculated. It is assumed that there are n sentences in the original text and m sentences in the historical text. The intertextuality values of all sentence pairs are combined to form the following intertextuality matrix:
$$A = (a_{ij})_{n \times m}$$
(ii) Intertextuality within the text: assuming that there are n sentences in the original text, the sentences of the original text have some intertextuality with each other, so the original text itself also yields an intertextuality matrix, which is shown as follows:
$$B = (b_{ij})_{n \times n}$$
Each of the above elements represents the intertextuality between a sentence in the original text and a sentence in the historical text: $a_{ij}$ represents the intertextuality between the $i$th sentence in the original text and the $j$th sentence in the historical text. Each row of the intertextuality matrix therefore corresponds to one sentence of the original text together with its index into the historical text. The elements of each row can be sorted by intertextuality level, and the qualifying queue that is output is the translation index of the original sentence. Most elements of the intertextuality matrix are 0, so the matrix is sparse and is stored as an array of linked lists, as shown in Figure 4.
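A small sketch of this storage scheme follows. Each original sentence gets one list that holds only the historical sentences with non-zero intertextuality, sorted from most to least similar; a Python list of lists stands in for the array of linked lists in Figure 4. The Jaccard similarity on characters, the function names, and the toy sentences are hypothetical stand-ins, not the measure or data used in the experiments.

```python
def jaccard(a, b):
    """Share of distinct characters that two strings have in common."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def build_translation_index(original, historical, sim=jaccard):
    """For each original sentence, keep (j, score) pairs with score > 0, best first."""
    index = []
    for sentence in original:
        row = [(j, sim(sentence, h)) for j, h in enumerate(historical)]
        row = [(j, s) for j, s in row if s > 0.0]          # sparse: drop zero entries
        row.sort(key=lambda item: item[1], reverse=True)    # rank by intertextuality
        index.append(row)
    return index

original = ["abc", "xyz"]               # hypothetical original sentences
historical = ["abcd", "cde", "qrs"]     # hypothetical historical sentences
index = build_translation_index(original, historical)
for i, row in enumerate(index):
    print(i, row)
```

An original sentence with no related historical sentence simply gets an empty list, so storage grows with the number of non-zero entries rather than with n × m.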

Each node in Figure 4 represents the corresponding statement, where the head node represents the target statement, that is, the statement to be translated. The nodes on the right represent the intertextuality index, that is, all reference nodes whose intertextuality is not 0. Nodes whose intertextuality value is 0 can be ignored.
Note that the diagonal element $b_{ii}$ of the within-text intertextuality matrix $B$ is always 1, because every sentence is compared with itself, so it cannot provide new information for understanding. Therefore, this value is not considered when queuing. Another feature is that $B$ is symmetric about the diagonal. Since the similarity between sentences i and j in the original text is equal to that between sentences j and i, only one triangle of the matrix needs to be computed, which reduces the amount of calculation by about half.
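The saving from symmetry can be made concrete with a short sketch that fills the within-text matrix by computing only the upper triangle and mirroring it. The similarity function, the function names, and the sample sentences are hypothetical; the counter is included only to show how many similarity evaluations are actually performed.

```python
def char_jaccard(a, b):
    """Share of distinct characters that two strings have in common."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def within_text_matrix(sentences, sim=char_jaccard):
    """Fill a symmetric n x n matrix while evaluating sim only for i < j."""
    n = len(sentences)
    b = [[0.0] * n for _ in range(n)]
    calls = 0
    for i in range(n):
        b[i][i] = 1.0                        # diagonal: a sentence matches itself
        for j in range(i + 1, n):            # upper triangle only
            b[i][j] = b[j][i] = sim(sentences[i], sentences[j])
            calls += 1
    return b, calls

sentences = ["abc", "bcd", "cde", "xyz"]    # hypothetical sentences
matrix, calls = within_text_matrix(sentences)
print(calls)   # n(n - 1) / 2 evaluations instead of n * n
```

For n sentences this performs n(n − 1)/2 similarity evaluations, and the diagonal entries are set directly and excluded from any ranking.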
5. Research and Analysis of Cultural Classics Translation Based on Deep Learning
5.1. Analysis of Cultural Classics Translation Based on Deep Learning
The following steps illustrate the construction process of the cultural classics translation index based on deep learning:
(1) Building a library and importing ancient books and documents
(2) Preprocessing the data in the literature
(3) Dividing each document into retrieval units, usually sentences
(4) Building the inverted index
(5) Calling the search function
(6) Parsing the input sentence and decomposing it into multiple retrieval units
(7) Searching the inverted table according to the retrieval units in the input sentence to obtain the search set S
(8) Calculating the intertextuality between the sentences in the search set S and the input sentence
(9) Arranging the sentences in the search set in order of intertextuality to form a ranked list, which is provided to the user
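The steps above can be sketched end to end with a toy inverted index. Everything in this block is an illustrative assumption: single characters are used as the units stored in the inverted table, a character-level Jaccard score stands in for the intertextuality measure, and the three-sentence library is invented.

```python
import re
from collections import defaultdict

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def split_sentences(text):
    """Steps (2)-(3): split a document into sentences on common end marks."""
    parts = re.split(r"[。！？.!?]", text)
    return [p.strip() for p in parts if p.strip()]

def build_inverted_index(sentences):
    """Step (4): map each character to the ids of the sentences containing it."""
    index = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        for unit in set(sentence):
            if not unit.isspace():
                index[unit].add(sid)
    return index

def search(query, sentences, index, sim=jaccard):
    """Steps (6)-(9): collect candidates, score them, and rank them."""
    candidates = set()
    for unit in set(query):
        candidates |= index.get(unit, set())
    ranked = sorted(candidates, key=lambda sid: sim(query, sentences[sid]),
                    reverse=True)
    return [(sentences[sid], sim(query, sentences[sid])) for sid in ranked]

library = "abc. bcd! xyz?"                  # step (1): hypothetical library text
sentences = split_sentences(library)
index = build_inverted_index(sentences)
results = search("ab", sentences, index)
print(results)
```

The inverted table limits the expensive intertextuality calculation to sentences that share at least one unit with the query, which is what keeps the search practical on a large library.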
The intertextuality is first calculated by the vector space model, and then the improved models described in this paper are used to calculate the intertextuality for further comparative analysis. The evaluation results obtained by the different models are calculated as ratios. The evaluation is based on the recall rate and the precision rate, and the F value is used to combine the two. The basic definitions are as follows:
$$P = \frac{\text{number of correct results retrieved}}{\text{number of results retrieved}}, \qquad R = \frac{\text{number of correct results retrieved}}{\text{number of correct results in total}}, \qquad F = \frac{2PR}{P + R}$$
This paper sets up three groups of experiments. First, we use the vector space model with the weighted value method as the baseline group (VSM). Secondly, we use the improved lexical semantic similarity method to complete a group of experiments (sem). Finally, we use the sequence model to complete a group of experiments (seq). The results obtained for the three groups are shown in Figure 5:
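As a small worked example of these evaluation measures, the sketch below computes precision, recall, and the balanced F value from a set of retrieved sentence ids and a set of relevant ones. The function name and the ids are hypothetical, and the balanced form of the F value (equal weight on precision and recall) is assumed.

```python
def precision_recall_f(retrieved, relevant):
    """Precision, recall and balanced F value for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)                 # correct results retrieved
    p = hits / len(retrieved) if retrieved else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Hypothetical query: four sentences returned, three sentences truly related.
p, r, f = precision_recall_f(retrieved=[1, 2, 3, 4], relevant=[2, 4, 5])
print(p, r, f)
```

Here two of the four returned sentences are correct, so precision is 1/2, recall is 2/3, and the F value is their harmonic mean, 4/7.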

According to the experimental results shown in Figure 5, the effect of the VSM benchmark test is ideal, the improvement obtained by the improved similarity calculation method (sem) is not significant, and the sequence model (seq) has the best effect. By comparing the three models in Figure 5, it can be concluded that the vector space model represents the results in the simplest way. To address the problems of the VSM model, this paper expands word similarity with semantic information and uses the sequence model for analysis.
5.2. Corpus Intertextual Translation Analysis
The texts of cultural classics show crisscrossing intertextual relevance. Their language is classical and concise and carries China’s history, culture, and connotations [28]. Because a target text contains a large number of earlier intertextual texts, it involves multidimensional recognition and compound diachrony, which increases the difficulty of translating cultural classics and can cause omissions or information errors during code switching. Therefore, when translating cultural classics with strong intertextuality, translators do not start directly from the original text but first analyze the earlier intertextual texts and then translate on that basis. During its long circulation, a text is affected by various factors and becomes altered, so the translator should compare and analyze multiple versions. In addition, the texts of cultural classics are often cited by others, and they are quoted, absorbed, and developed during their circulation.
Searching the corpus can help the translator complete the basic translation work. When translating cultural classics, intertextual markers are important elements in the text. Intertextual markers are shown in Figure 6.

A bilingual parallel corpus is a large electronic database of annotated original texts and their corresponding translations. The two texts are aligned with each other at the level of paragraphs, texts, vocabulary, and sentences. Using the deep learning algorithm, this information is extracted from the tea classic and its translation, and monolingual and parallel bilingual corpora of tea culture are built for statistics and retrieval. After that, monolingual corpus retrieval software is used to examine the text characteristics of the English translation of the book of tea, and the text information is counted by the deep learning algorithm. The specific information is listed in Table 2.
In Table 2, there are 128273120 tokens in the English translation version of the tea classic. The standardized type/token ratio (STTR) and the type/token ratio (TTR) are 46.367 and 25.93, respectively, where one thousand words are used as the unit for counting. The total number of paragraphs obtained when all the title information is retained is 340, and the number of pure paragraphs left after all the titles are removed is 314. The statistics show that the TTR and STTR values of the English translation of the tea classic are low, which means that many words are repeated in its phrases and sentences. Manual checking and searching show that these include a large number of professional terms related to tea culture.
Since the total number of tokens in the ancient Chinese text is less than 1000, it is difficult to count them all. In terms of language characteristics, the number of tokens in English is more than twice that in Chinese. The paragraph counts indicate that the translator intended the paragraphs to correspond to those of the ancient Chinese book of tea, so the number of paragraphs is roughly the same. The repetition rate of types in the book of tea is very low. In the bilingual version, the types with frequency 1 and frequency 2 account for 86% and 84% of all types, respectively. Table 3 shows the proportion statistics of the low-frequency words and text.
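For readers unfamiliar with these corpus statistics, the sketch below shows one common way to compute them: TTR is the number of distinct word forms (types) divided by the number of running words (tokens), and STTR averages the TTR over consecutive fixed-size chunks so that texts of different lengths can be compared. The function names, the percentage scaling, the choice to discard a final incomplete chunk, and the toy sentence are assumptions; corpus tools may differ in these details.

```python
def ttr(tokens):
    """Type/token ratio as a percentage."""
    return 100.0 * len(set(tokens)) / len(tokens) if tokens else 0.0

def sttr(tokens, chunk=1000):
    """Mean TTR over consecutive full chunks; a trailing partial chunk is dropped."""
    ratios = [ttr(tokens[i:i + chunk])
              for i in range(0, len(tokens) - chunk + 1, chunk)]
    return sum(ratios) / len(ratios) if ratios else 0.0

# Hypothetical eight-word text, chunked by 4 instead of 1000 to keep it small.
tokens = "tea water tea cup tea leaf tea pot".split()
print(ttr(tokens))            # 5 types over 8 tokens
print(sttr(tokens, chunk=4))  # each chunk has 3 types over 4 tokens
```

The example also shows why STTR is usually higher than TTR on longer texts: repetitions accumulate over the whole text but are limited within each chunk. A text shorter than one chunk yields no STTR value at all under this definition.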
6. Conclusions and Future Work
With the rapid development of society and the rapid change of science and technology, the field of international translation has received increasing attention and has gradually developed. Through the translation of cultural classics, we can understand the historical, cultural, and humanistic information of a particular country of interest. To this end, the translation quality, translation process, and translation environment of cultural classics should be strengthened. In studying the translation of cultural classics, this paper uses deep learning algorithm analysis, establishes a mathematical model for the statistical measurement of intertextuality, and constructs the text translation index. For the translation of complex cultural classics, the effects of three models are tested experimentally, of which the sequence model (seq) is the best. This model extracts the text of literary classics quickly and simply. Based on the analysis of the crisscrossing intertextual relevance in cultural classics, the text is divided into paragraphs, sentences, and words. Finally, these are judged by comparing them with the contents of earlier intertextual texts, so as to improve the accuracy of cultural classics translation.
In the near future, we aim to build more effective deep learning-based text identification and translation models with improved accuracy. However, deep learning approaches are very time-consuming and depend on the amount of data, so training a model may take a long time. Therefore, new strategies, such as data aggregation, can be developed to improve algorithm performance in terms of training time and prediction time. In addition to the CNN, other techniques such as LSTM, graph convolutional networks (GCN), ResNet, and attention networks can be investigated. Although the ReLU function was employed in this research, we believe that alternative activation functions might provide different results, so comparing a variety of such functions is a worthwhile piece of future work.
Data Availability
Data are available on request to the corresponding author.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The work was supported by the Hebei Social Science Fund Project: A Study of the Translation and Dissemination of the Classical Yanzhao Culture (HB20XW015).