Abstract
To address the low accuracy of English phrase part-of-speech recognition, the poor translation quality, and the long translation time of traditional English translation models, an English translation model based on intelligent recognition and deep learning is designed. An English phrase corpus is first built; the antecedent and successor likelihoods of phrases in the improved GLR algorithm are then calculated using the quaternion cluster, and the parts of speech of the phrases in the corpus are identified. Based on the recognition results, a feature extraction algorithm is introduced to extract the best contextual features. On this basis, a neural machine translation model is constructed by integrating a traditional neural network from deep learning with an attention mechanism, and this model is used for English translation. The simulation results show that the English translation model based on intelligent recognition and deep learning achieves high phrase recognition accuracy, good translation quality, and short translation time, thereby improving the quality of English translation.
1. Introduction
In recent years, with the continuous development of education, science, and technology, the number of machine translation products has kept increasing. These applications are concentrated mainly in foreign-language translation scenarios such as academic literature and search engines [1]. Machine translation technology therefore has a huge market demand and good development prospects. In the past, however, machine translation technology has suffered from various shortcomings, such as low translation accuracy, which has become a major bottleneck hindering its further development [2]. For example, the results produced by Baidu and Google translation software differ considerably from professional manual translation, indicating that the current level of English machine translation cannot meet current needs [3]. Thanks to the development of big data, many researchers seek to complete translation through computer-aided translation (CAT) [4] in order to improve the accuracy of English translation and achieve real-time, accurate results, and they build English translation models to promote the further development of English machine translation technology. Constructing a new English translation model is therefore of great practical significance [5].
Reference [6] designs an intelligent recognition English translation model based on the improved GLR algorithm. The model constructs a phrase corpus with a tag size of about 740,000 English and Chinese words so that phrases can be searched. The phrase structure is built around the phrase center to obtain part-of-speech recognition results, and English and Chinese structural ambiguities in these results are corrected according to the syntactic function of the analytical linear table, finally yielding the recognized content and completing the English translation model. However, the model has difficulty accurately identifying the parts of speech of English phrases, which degrades the quality of subsequent English translation. Reference [7] designs a syntax-based neural machine English translation model. A multilayer neural network vectorizes unlabeled text vocabulary, combines lexical representations with vector features, and extracts effective sentence-level and semantic information. Under an online ranking framework, the neural network ranks and scores words, obtains the semantic information of the sample data, and predicts word-order differences, thereby completing the translation model. However, its accuracy in recognizing the parts of speech of English phrases is low, which results in a poor translation effect. Reference [8] designs an English translation model based on the integration of syntactic features. The model combines a language-template-based translation method with a statistical translation method based on conditional random fields to segment and process long sentences along both syntactic and statistical dimensions, thereby improving the quality of machine translation. However, the implementation process is overly complex, which increases the English translation time and decreases translation efficiency. Reference [9] designs an English translation model based on a feature extraction algorithm: the optimal translation solution is selected by introducing the feature extraction algorithm, and a semantic mapping model is constructed for interactive optimization of English-Chinese translation. However, in practice this model suffers from long translation times, so its translation efficiency remains low.
Summarizing the above literature and its problems, it is found that the semantics contained in the phrases of a sentence usually form the core content of the sentence during translation, and intelligent phrase recognition is therefore an important link in language recognition. Its principle is to identify and summarize the phrases in a sentence, analyze the part of speech and syntax of each phrase, translate and automatically combine them according to the phrase corpus, and finally obtain the translation of the original sentence. In the field of machine translation, intelligent phrase recognition is a key technology that supports the selection of suitable translation samples and the accurate alignment of parallel corpora, and it can effectively reduce grammatical ambiguity. Deep learning is a machine learning method. The main subject of machine learning research is algorithms that generate models from large amounts of data, that is, learning algorithms. With appropriate learning algorithms, computers can generate models from these data, and when facing new situations, the model gives appropriate judgments according to the situation. The advantages of deep learning are as follows: (1) compared with shallow learning structures, a deep structure can use fewer parameters to approximate a more complex mapping relationship, and (2) the intermediate representation obtained from one learning task can be applied to another task, which amounts to multitask learning. Deep learning improves computation speed and processing performance, is well suited to machine translation tasks, and provides a new approach in the field of intelligent machine translation. Based on this, this paper designs an English translation model that combines intelligent recognition and deep learning in order to alleviate structural ambiguity in the current field of English translation, improve the efficiency of phrase recognition, and enhance the quality of English translation.
2. Design of English Translation Model Based on Intelligent Recognition and Deep Learning
2.1. Construction of Phrase Corpus
The corpus plays an important role in an intelligent English translation model. Storing bilingual phrase data in a corpus makes it possible to accurately label the parts of speech of short phrases in Chinese and English, standardize the function of each phrase, greatly improve the accuracy and timeliness of the automatic phrase recognition algorithm in English-Chinese machine translation, and help English-Chinese machine translation work more accurately [10]. Common English-Chinese machine translation converts long sentences into multiple pairs of short phrases, matches them against the corpus, and uses a scoring algorithm to evaluate the quality of the translated context and the corresponding translated phrases; enlarging the marking range can effectively improve the score, which is itself a new idea for algorithm innovation, and the machine translation result is finally formed from these matches [11]. Therefore, the overall quality of the constructed phrase corpus plays a vital role in machine translation algorithms. Figure 1 shows the flow of phrase corpus information.

The phrase corpus constructed for the intelligent recognition English translation model in this paper contains 740,000 words, which is sufficient to construct 22,000 sentences and 12,000 phrases. As the phrase corpus information in Figure 1 shows, the phrase corpus is targeted: this paper selects a phrase corpus for English-Chinese machine translation, labels the English and Chinese phrase corpora separately, and distinguishes the tenses of different phrase corpora [12]. The marking method of the corpus consists of three parts: data, level, and processing method. The data are in text format; the level covers part of speech and alignment; and the processing method adopts active human-machine communication, direct interaction, and the routine operations of English translation, so as to improve the accuracy of phrase corpus translation [13].
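As a concrete illustration of the data/level/processing description above, the following minimal Python sketch shows how a single bilingual phrase entry might be stored and looked up. The field names (en, zh, pos, tense, alignment) and the lookup helper are illustrative assumptions, not the corpus's actual schema.

```python
entry = {
    "en": "take part in",                    # English phrase (data: text format)
    "zh": "参加",                             # Chinese counterpart
    "pos": "VP",                              # part-of-speech label (level: part of speech)
    "tense": "present",                       # tense tag distinguishing phrase corpora
    "alignment": [(0, 0), (1, 0), (2, 0)],    # EN->ZH word alignment (level: alignment)
}

def lookup(corpus, en_phrase):
    """Return every entry whose English side matches the queried phrase."""
    return [e for e in corpus if e["en"] == en_phrase]

print(lookup([entry], "take part in"))
```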
2.2. Corpus Part of Speech Recognition Based on Intelligent Recognition
In the machine intelligent recognition algorithm, phrase part-of-speech recognition is particularly important because it can resolve a large number of grammatical ambiguities among phrases, sentences, and words. Words in short sentences can be segmented by tagging the content of the phrase corpus [14]. The words in English sentences exist independently, so word segmentation of Chinese words and sentences can be performed, the parts of speech of translated sentences and words can be judged, and phrase dependencies can finally be analyzed syntactically to build the sentence syntax tree [15]. Using this method, the accuracy and effectiveness of the machine translation process and the ability to process the phrase corpus can both be improved. The GLR algorithm is widely used in part-of-speech recognition to judge the relationship between the preceding and following parts of a phrase, taking the dynamic recognition form as its basic unconditional transfer statement. Because GLR does not detect grammatical ambiguity during phrase translation, repeated checking and calibration are required [16]. If syntactic ambiguity is detected, the syntactic analysis geometry is used to linearly call the analytical linear table to identify the content of the phrase, the quality of the content is improved by the principle of local optimization, and symbols are transmitted through different recognition channels, so as to improve the accuracy of the recognition results [17]. In general, because the part-of-speech recognition results of the classical GLR algorithm are somewhat arbitrary and the identified data points have a high probability of coinciding, it cannot meet existing accuracy requirements for part-of-speech recognition [18]. The improved GLR algorithm analyzed in this paper uses the phrase center to analyze the phrase structure, which reduces the coincidence probability of data points and improves the accuracy of part-of-speech recognition [19]. The quaternion cluster used to calculate the antecedent and successor likelihoods of phrases in the improved GLR algorithm can be written as $G = (S, C, T, A)$, where $S$ is the start symbol cluster, $s$ is an element of $S$, $C$ is the cyclic symbol cluster, $T$ is the termination symbol cluster, and $A$ is the phrase action cluster [20].
If $a$ represents any action in $A$ and $a$ exists in the analysis, it can be deduced that $a$ is described by the symbol on the right side of the action, the constraint value, the center-point symbol, and the marking method [21].
The improved GLR algorithm stipulates that the top symbol of the identification linear table must be consistent with the right-side symbol of the action, that the constraint value must be true, and that the center-point symbol must be a numerical value and cannot be null. Only recognition results that satisfy all three of these criteria are accepted as phrase part-of-speech recognition results [22].
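To make the three acceptance criteria concrete, the short Python sketch below checks them for one candidate result. The record fields and the helper name are assumptions made for illustration; they are not the original implementation of the improved GLR algorithm.

```python
def accept_recognition(top_symbol, expected_symbol, constraint_value, center_point):
    """Keep a candidate part-of-speech result only if all three criteria hold."""
    checks = [
        top_symbol == expected_symbol,            # top of the identification linear table matches
        constraint_value is True,                 # the constraint value must be true
        isinstance(center_point, (int, float)),   # the center-point symbol must be numeric, not null
    ]
    return all(checks)

print(accept_recognition("NP", "NP", True, 0.87))   # True: accepted
print(accept_recognition("NP", "NP", True, None))   # False: rejected, center point is null
```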
2.3. Contextual Feature Extraction
The most basic functions of context are restriction and interpretation; its other functions are derived from these two. The so-called restrictive function refers to the restrictive effect of context on language research and application. The explanatory function of context is aimed at readers, listeners, and language analysts, and refers to the ability of context to explain certain language phenomena in speech activities. The rich content of English context gives it several characteristics, including certainty, relativity, hierarchy, transitivity, and reflexivity. Based on the context feature extraction results, the neural machine translation model that integrates a traditional neural network with the attention mechanism is used as the model for English translation, so as to improve the quality of English translation.
According to the corpus parts of speech identified above, this paper introduces a feature extraction algorithm, which maps the best context into the translation process and completes the standard extraction of contextual features; the extracted best context is then described through a semantic ontology mapping model [23]. Suppose that a set of translation contexts exists in the translation process, a certain number of which belong to class semantic translation, and that each class semantic translation has an associated probability represented as a directional multidimensional vector [24]. The translation contexts that reach the basic standard through this limited process are those contexts capable of translating the semantics [25]. The best context is then selected from among these candidates.
The nonsemantic translation context matrix and the suitable semantic translation context matrix are calculated separately.
The semantic context relevance matrix defines the optimal context and serves as the standard for measuring the relevance of semantic contexts, so its values directly reflect the relevance mapping process [26]. The semantic context correlation matrix contains at most a fixed number of optimal translation contexts; once these optimal contexts have been extracted, the characteristic semantics of the optimal context can be expressed in terms of them.
Through the above process, the extraction of the optimal context in the translation process is completed [27].
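The following NumPy sketch illustrates one plausible way to pick the most relevant contexts from a semantic-context relevance matrix. The matrix values, the scoring rule (taking each context's best relevance across semantic classes), and the limit k are all illustrative assumptions; the paper does not specify these details.

```python
import numpy as np

# Hypothetical relevance matrix R: R[i, j] scores how well candidate context i
# explains semantic class j. The numbers are random placeholders.
rng = np.random.default_rng(0)
R = rng.random((6, 4))                     # 6 candidate contexts, 4 semantic classes

k = 2                                      # keep at most k optimal contexts
scores = R.max(axis=1)                     # best relevance each context achieves
best_idx = np.argsort(scores)[::-1][:k]    # indices of the k highest-scoring contexts
print("selected contexts:", best_idx, "with scores", scores[best_idx])
```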
2.4. English Translation Model
According to the optimal context features extracted above, a neural machine translation model that integrates a traditional neural network with the attention mechanism is used as the English translation model. The neural network is a kind of deep learning model. Neural machine translation consists mainly of an encoder, an attention mechanism, and a decoder, and it is a unified neural network that can be optimized end to end [28]. The core of the neural machine translation model is to integrate RNNs and their variants (including LSTM and GRU), CNNs, and other traditional deep learning models into the encoder-decoder framework to realize the machine translation process [29]. For convenience of explanation, this paper takes the GRU (gated recurrent unit) network as the basic network of the framework [30]. Given the source language sentence $x = (x_1, x_2, \dots, x_n)$, neural machine translation adopts a discriminative modeling method and uses the neural network to directly predict the conditional probability of generating the target language sentence $y = (y_1, y_2, \dots, y_m)$:

$$P(y \mid x) = \prod_{j=1}^{m} P(y_j \mid y_{<j}, x).$$
That is, English translation uses all of the information at the source end and the partial translation already produced at the target end to generate the next target word $y_j$ [31]. The whole process is repeated until all target words of the target sentence have been predicted one by one. In this process, the source information $x$ is the core of the translation model and supplies the source semantics required to generate the target word $y_j$, so that the generated translation faithfully reflects the meaning of the source sentence; the partial target translation $y_{<j}$ is the core of the language model and supplies the sentence context of the target word $y_j$, which helps the generated translation remain fluent and natural [32]. Figure 2 shows the overall structure of English translation.
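A small numerical example of the factorization above: the per-step probabilities below are made-up values, not real model outputs, but they show how the sentence-level probability is accumulated from the word-level conditionals (summing log-probabilities to avoid underflow).

```python
import math

step_probs = [0.9, 0.8, 0.95, 0.7]                 # assumed P(y_j | y_<j, x) for each target word
log_p = sum(math.log(p) for p in step_probs)       # log P(y | x) = sum of per-step log-probabilities
print("P(y|x) =", math.exp(log_p))                 # 0.4788, the product of the four factors
```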

In order to extract the semantic feature information to be expressed by the source language sentence, the encoder uses a bidirectional GRU network. The advantage of this is that both the left and right context of each word is considered, which makes the extracted feature information more expressive; the obtained information is encoded as vectors in a continuous space [33]. For a given source language sentence $x = (x_1, x_2, \dots, x_n)$, the calculation process of the encoder is

$$\overrightarrow{h}_i = \mathrm{GRU}\big(\overrightarrow{h}_{i-1}, e(x_i)\big), \qquad \overleftarrow{h}_i = \mathrm{GRU}\big(\overleftarrow{h}_{i+1}, e(x_i)\big), \qquad h_i = \big[\overrightarrow{h}_i; \overleftarrow{h}_i\big],$$

where $\overrightarrow{h}_i$ represents the left-to-right implicit state information at time $i$ in the encoder, $e(x_i)$ represents the vector representation of the source language word at position $i$, $\overrightarrow{h}_i$ is computed by taking $\overrightarrow{h}_{i-1}$ and $e(x_i)$ as the inputs of the GRU network, and the right-to-left state $\overleftarrow{h}_i$ is defined analogously [34]. Finally, $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$ indicates that the two directions are spliced together as the hidden state information at time $i$, and the calculated results are combined with the attention mechanism to generate the input of the decoder.
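A minimal Keras sketch of such a bidirectional GRU encoder is given below, using the open-source TensorFlow framework mentioned in the experiments. The vocabulary size, embedding dimension, and hidden size are assumed values chosen to match the 512-dimensional word vectors reported later; they are not prescribed by the model itself.

```python
import tensorflow as tf

SRC_VOCAB, EMB_DIM, HIDDEN = 60000, 512, 512   # assumed sizes

# Embed source word ids, then run a bidirectional GRU and concatenate directions,
# so each position i yields h_i = [h_i(forward); h_i(backward)].
encoder = tf.keras.Sequential([
    tf.keras.layers.Embedding(SRC_VOCAB, EMB_DIM, mask_zero=True),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.GRU(HIDDEN, return_sequences=True), merge_mode="concat"),
])

x = tf.constant([[12, 845, 3, 99, 0, 0]])   # one padded source sentence of word ids
h = encoder(x)                              # shape (1, 6, 2 * HIDDEN)
print(h.shape)
```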
The function of the decoder is to take the results calculated by the encoder as input and generate the target language sentence word by word [35]. The decoder is equivalent to a conditional language model and is similar to the encoder, but it adopts a unidirectional RNN; here, a unidirectional GRU network is again taken as an example to explain the calculation. Specifically, the calculation process of the decoder is

$$s_j = \mathrm{GRU}\big(s_{j-1}, e(y_{j-1}), c_j\big), \qquad P(y_j \mid y_{<j}, x) = g\big(s_j, e(y_{j-1}), c_j\big),$$

where $e(y_{j-1})$ represents the input of the GRU network, $s_j$ represents the implicit state representation at time $j$, $c_j$ represents the context vector obtained by the attention mechanism, and $g$ represents the neural network model used to map the input information onto the thesaurus of target words.
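The sketch below implements one such decoder step in Keras, with additive (Bahdanau-style) attention standing in for the attention mechanism; the layer sizes, the specific attention layer, and the way the context vector is concatenated into the GRU input are all assumptions for illustration rather than the paper's exact design.

```python
import tensorflow as tf

HIDDEN, TGT_VOCAB, EMB_DIM = 512, 60000, 512     # assumed sizes

embed = tf.keras.layers.Embedding(TGT_VOCAB, EMB_DIM)
gru_cell = tf.keras.layers.GRUCell(HIDDEN)
attention = tf.keras.layers.AdditiveAttention()  # Bahdanau-style additive attention
project = tf.keras.layers.Dense(TGT_VOCAB)       # maps onto the target-word thesaurus

enc_states = tf.random.normal((1, 6, HIDDEN))    # encoder outputs h_1..h_n (placeholder values)
s_prev = tf.zeros((1, HIDDEN))                   # previous decoder state s_{j-1}
y_prev = tf.constant([[37]])                     # previously generated word y_{j-1}

c_j = attention([s_prev[:, None, :], enc_states])[:, 0, :]    # context vector c_j
gru_in = tf.concat([embed(y_prev)[:, 0, :], c_j], axis=-1)    # GRU input from e(y_{j-1}) and c_j
s_j, _ = gru_cell(gru_in, [s_prev])                           # new decoder state s_j
p_j = tf.nn.softmax(project(tf.concat([s_j, c_j], axis=-1)))  # P(y_j | y_<j, x) over the thesaurus
print(p_j.shape)                                              # (1, TGT_VOCAB)
```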
The maximum likelihood function is used to train and optimize the model [36]. Specifically, given the sentence pairs $(x, y)$ of source language sentences and target language sentences in the training corpus $D$, the likelihood function takes the form

$$\hat{\theta} = \arg\max_{\theta} \sum_{(x, y) \in D} \log P(y \mid x; \theta),$$

where $\theta$ represents the model parameters. The formula states that, given the bilingual parallel training sentence pairs and the source language sentence $x$, the parameters are chosen so that the reference target translation $y$ is generated with probability as high as possible. For the end-to-end neural machine translation model, the parameters are generally updated with the gradient descent algorithm.
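One training step under this objective can be sketched as follows: the loss is the negative log-likelihood of the reference target words, minimized by gradient descent. The toy stand-in model (an embedding followed by a dense projection) and the tiny word-id tensors are assumptions so that the snippet runs on its own; a real system would use the encoder-decoder defined above.

```python
import tensorflow as tf

TGT_VOCAB = 60000
model = tf.keras.Sequential([tf.keras.layers.Embedding(TGT_VOCAB, 64),
                             tf.keras.layers.Dense(TGT_VOCAB)])        # toy stand-in for the NMT model
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)                  # plain gradient descent
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

src = tf.constant([[5, 17, 2]])    # source word ids x
tgt = tf.constant([[8, 42, 3]])    # reference target word ids y

with tf.GradientTape() as tape:
    logits = model(src)            # per-step scores over the target thesaurus
    loss = loss_fn(tgt, logits)    # -log P(y | x; theta), averaged over target positions
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
print(float(loss))
```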
After the neural machine translation model has been trained, the best translation must be selected from the candidate translation space generated by the model; that is, the output translation is the candidate satisfying

$$\hat{y} = \arg\max_{y \in \mathcal{Y}(x)} P\big(y \mid x; \hat{\theta}\big),$$

where $\mathcal{Y}(x)$ represents the candidate translation space composed of all candidate translations.
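In code, this selection step amounts to taking the candidate with the highest model score; the candidate strings and log-probability values below are illustrative numbers only, not outputs of the trained model.

```python
# Hypothetical candidate translations and their log P(y | x) under the model.
candidates = {
    "he is singing loudly": -2.1,
    "he sings loudly": -1.4,
    "he loudly is singing": -3.0,
}
best = max(candidates, key=candidates.get)   # arg max over the candidate space
print(best)                                  # "he sings loudly"
```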
3. Simulation Experiment Analysis
3.1. Experimental Environment and Parameter Setting
In order to verify the effectiveness of the English translation model based on intelligent recognition and deep learning in practical application, a simulation experiment is carried out. The hardware environment of the experiment is an Intel® Core™ i7-8550U CPU @ 1.80 GHz 1.99 GHz with 16.0 GB of memory and a 512 GB SSD, running the Windows 10 Professional operating system. Google's open-source deep learning framework TensorFlow is used for model training and testing, with four RTX 2080 Ti cards as computing resources. The data analysis and test environment is PyCharm 2017 Professional Edition, and training takes about 52 hours. The experimental parameter settings are shown in Table 1.
3.2. Experimental Data Set
The training corpus comes from three sources. The 2018 AI Challenger English-Chinese bilingual corpus provides 5M Chinese-English bilingual parallel sentence pairs (M denotes millions of sentence pairs), a 3M Chinese monolingual corpus, and a 2M English monolingual corpus. The 2018 WMT Chinese-English machine translation training data provide 12M bilingual parallel sentence pairs, a 6M Chinese monolingual corpus, and a 4M English monolingual corpus. The 2018 CWMT data provide 6M Chinese-English parallel sentence pairs, a 2M Chinese monolingual corpus, and a 2M English monolingual corpus. These monolingual sentences are unrelated to each other; in other words, a Chinese sentence has no corresponding English translation. The translations corresponding to these monolingual data are taken as the reference translations for comparison with the generated translations. The NIST 2006 data set is selected as the validation set; the test sets are the NIST 2005, NIST 2008, and NIST 2012 data sets, each with three reference translations. For the Chinese-English comparison, the first English sentence of the three reference translations is used as the source language sentence, and the Chinese sentence is used as an independent reference translation. In order to limit the scale of the thesaurus, the 60,000 most frequent words in the training corpus are used to construct the thesaurus, and the remaining low-frequency words are marked as <unk>. In the training corpus, the dimension of the word vector is set to 512 and sentence length is limited to 50.
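The thesaurus construction and length limit described above can be sketched as follows; the tiny example corpus and the helper names are invented for illustration, while the 60,000-word limit, the <unk> token, and the 50-word length cap follow the settings stated in the text.

```python
from collections import Counter

VOCAB_LIMIT = 60000   # keep only the most frequent words, as stated above
MAX_LEN = 50          # sentence length limit

corpus = ["he is singing loudly", "she is reading", "he is reading loudly"]   # toy corpus
counts = Counter(w for sent in corpus for w in sent.split())
vocab = {w: i + 1 for i, (w, _) in enumerate(counts.most_common(VOCAB_LIMIT))}
vocab["<unk>"] = 0                                       # low-frequency / unseen words

def encode(sentence):
    """Map words to ids, sending out-of-vocabulary words to <unk> and truncating to MAX_LEN."""
    return [vocab.get(w, vocab["<unk>"]) for w in sentence.split()[:MAX_LEN]]

print(encode("he is shouting loudly"))   # "shouting" is unseen, so it becomes the <unk> id 0
```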
A pair of Chinese-English bilingual sentence pairs are randomly selected from the corpus data, as shown in Figure 3.

3.3. Analysis of Experimental Results
The English translation model based on intelligent recognition and deep learning designed in this paper, the intelligent recognition English translation model based on the improved GLR algorithm designed in Reference [6], and the syntax-based neural machine English translation model designed in Reference [7] are used to test the English translation of the above samples. The test results are shown in Figures 4–6.



According to Figure 4, the intelligent recognition English translation model based on the improved GLR algorithm designed in Reference [6] suffers from "undertranslation": some words in the Chinese sentence are not translated, for example "shout". The syntax-based neural machine English translation model designed in Reference [7] suffers from "overtranslation", with "singing" translated twice. The English translation generated by the model based on intelligent recognition and deep learning designed in this paper is very similar to the reference translation; apart from a tense problem, there are no serious errors. Although the translation of some Chinese words may differ from the reference translation, the semantics is correct, which may be related to the training corpus used during model training. Therefore, the English translation model based on intelligent recognition and deep learning designed in this paper achieves a good English translation effect.
In order to verify the effectiveness of this model, the English translation model based on intelligent recognition and deep learning designed in this paper, the intelligent recognition English translation model based on the improved GLR algorithm designed in Reference [6], and the neural machine English translation model based on syntax designed in Reference [7] are used to identify the part of speech of English phrases and verify the accuracy of the three models. The comparison results are shown in Figure 7.

According to Figure 7, the accuracy of English phrase part-of-speech recognition of the model based on intelligent recognition and deep learning designed in this paper can reach 100%, whereas the intelligent recognition English translation model based on the improved GLR algorithm designed in Reference [6] reaches at most 60% and the syntax-based neural machine English translation model designed in Reference [7] reaches at most 78%. The English translation model based on intelligent recognition and deep learning designed in this paper therefore achieves higher accuracy in English phrase part-of-speech recognition.
In order to further verify the effectiveness of this model, the English translation model based on intelligent recognition and deep learning designed in this paper, the intelligent recognition English translation model based on the improved GLR algorithm designed in Reference [6], and the syntax-based neural machine English translation model designed in Reference [7] are used to compare and analyze the English translation time. The comparison results are shown in Figure 8.

According to the analysis of Figure 8, the translation time of the intelligent recognition English translation model based on the improved GLR algorithm designed in Reference [6] is between 3 s and 14 s, the translation time of the syntax-based neural machine English translation model designed in Reference [7] is between 5 s and 20 s, and the time consumed by the English translation model based on intelligent recognition and deep learning designed in this paper is within 5 s. This shows that the model designed in this paper translates English in a shorter time and is more efficient.
4. Conclusion
Owing to the rapid development of globalization, information flows between countries at high speed, and English has become the main language of international communication. Machine translation is a difficult and important task in the field of natural language processing that closely follows developments in computer technology, artificial intelligence, and mathematical logic. From early rule-based translation, which combined dictionary matching with the knowledge of linguistic experts, to corpus-based statistical machine translation, and with the significant improvement of computing power and the explosive growth of multilingual information, machine translation technology has gradually stepped out of the ivory tower and begun to provide real-time, convenient translation services for general users. Machine translation has been studied in academia for a long time, and researchers have devoted a great deal of work to it. In the process of English translation, the parts of speech of phrases identified by traditional translation models are inaccurate, which leads to unsatisfactory translation quality and low translation efficiency. With the development of deep learning technology and its application to machine translation, machine translation systems can, like human translators, repeatedly review and understand complex sentences and translate them with their context. In order to improve the accuracy of English phrase part-of-speech recognition, improve the quality of English translation, and reduce translation time, this paper designs an English translation model based on intelligent recognition and deep learning and verifies through simulation experiments that the model can translate English quickly and accurately, demonstrating good part-of-speech recognition accuracy, a good translation effect, and a short translation time. This lays a foundation for the further development of machine translation.
Data Availability
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Conflicts of Interest
The authors declared that they have no conflicts of interest regarding this work.
Acknowledgments
This work was supported by the Education Reform Project of Heilongjiang Province (SJGY20200114), Creative Foundation of Northeast Petroleum University (2020YTW-W-03), and Project for Innovation and Entrepreneurship course construction of Northeast Petroleum University.