Abstract
Adapting traditional neural network algorithms to high-resolution range profile (HRRP) target recognition is a difficult problem in the current radar target recognition field. Based on an in-depth analysis of the long short-term memory (LSTM) network structure and algorithm, this study uses an attention model to extract information from the sequence data and builds a dual parallel sequence network model for rapid classification and recognition, which effectively improves the initial LSTM network structure while reducing the number of network layers. The HRRP target recognition performance is demonstrated through designed control experiments. The experimental results show that the bidirectional long short-term memory (BiLSTM) algorithm has obvious advantages over the template matching method and the initial LSTM network. The improved BiLSTM algorithm proposed in this study significantly improves radar HRRP target recognition accuracy, which verifies the effectiveness of the improved algorithm.
1. Introduction
A radar high-resolution range profile (HRRP) is the sum of the projection vectors of the echoes received by the radar in the radial direction after the signals emitted by a wideband radar are scattered by a target. HRRP contains the distribution information of the target's scattering points along the radial direction of the receiving antenna [1]. Through analysis, the size and structure of the target itself and the parameters of the equivalent scattering center distribution can be obtained, making HRRP an important data source for target recognition and classification [2, 3]. SAR images contain abundant two-dimensional structural information about targets; however, imaging requires the accumulation of target attitude angles during movement [4]. In practical applications, it is difficult to obtain data on the highly maneuverable flight of non-cooperative targets [5, 6], while HRRP is easy to obtain [7], and the technology to suppress range ambiguity is sufficiently mature [8]. Therefore, HRRP is widely applied in the field of radar automatic target recognition (RATR). For fast and accurate recognition and classification of targets by HRRP, how to extract comprehensive features from known data and complete the analysis and processing of target information is the focus of this study.
In response to the above problems, many researchers have carried out extensive experiments and studies [9, 10]. Duran [3] proposed a parametric statistical distribution model for the azimuth sensitivity, translation sensitivity, and range sensitivity of radar HRRP. Zhou et al. [11] proposed a subspace fuzzy optimization transformation (FOT) method, which preserves the local structure and maximizes the distance between clustering centers; this kind of algorithm requires a large amount of data and has low recognition accuracy. Reference [12] uses sparse coding dictionary learning to recognize targets. The above methods all analyze the target while paying insufficient attention to the correlation between radar HRRP range cells. Subsequently, radar HRRP models based on the hidden Markov model (HMM) were proposed in References [13–16], where the HMM is used to calculate the transition probabilities of sequence data across multiple range cells. In this method, an HRRP database and a database index are established, and the data are divided into segments according to azimuth angle. Statistical modeling is carried out for each segment at training time, the maximum posterior probability that a sample belongs to a segment is calculated, and target recognition is finally carried out according to the matching similarity probability. This method makes use of the sequence correlation of HRRP data; however, the segmentation recognition method requires many data samples and heavy computation, and the accuracy of the model trained in the early stage significantly affects the classification accuracy during verification.
Nowadays, neural networks have many applications in the field of target recognition [17–19], and the sequence correlation of HRRP has received much attention in recurrent neural network (RNN) research. Reference [20] proposes an attention model based on a recurrent neural network, exploiting the time series characteristics of HRRP: the attention model encodes the time domain data after assigning different weights to the data in each range cell, and the hidden layer coding features are used to recognize the target. Reference [21] uses a bidirectional self-recurrent neural network combined with an attention mechanism, intercepting the forward and reverse data via bidirectional sliding windows over the time domain HRRP data and feeding them into independent GRU networks; after training, the features are concatenated for target recognition and classification. Reference [22] uses a bidirectional long short-term memory network to extract target feature information and then performs fusion to output classification results. Reference [23] realizes deep feature mining of samples by stacking long short-term memory networks, thus obtaining better recognition. The deep belief network [24] is also a common approach at present. Reference [25] creates a recurrent gamma belief network (RGBN) to extract the deep structural features of the target; in addition, a hybrid of stochastic gradient Markov chain Monte Carlo (MCMC) and a recurrent variational inference model is proposed for scalable training and fast out-of-sample prediction. This kind of method uses a recurrent neural network to analyze and process sequence-structured data, weakens the segmentation requirement on the target attitude angle, explores the correlation within the sample, and analyzes the characteristics of the model itself and the structural correlation characteristics of the internal information.
2. The Proposed Method
In traditional HRRP data processing, because of the high dimensionality of the data and the presence of information redundancy, dimensionality reduction is applied before statistical recognition, and the target type is determined by checking the posterior probability of the sample under test. Such methods mainly include the adaptive Gaussian classifier (AGC), the gamma model, and the gamma mixture model [26]. However, these models have low degrees of freedom and are not suitable for target recognition from small-sample data; moreover, their description of the statistical characteristics of the target is not comprehensive, so the features of the target cannot be fully captured, which affects recognition accuracy. The factor analysis (FA) model can capture the characteristics of the target better, but its robustness is weak, since it is greatly affected by the attitude sensitivity of the target; to obtain a more comprehensive model of the target, factor analysis requires a large amount of target data for statistical modeling in the early training stage. In the above methods, HRRP data are regarded as a combination of independent range cell echo sequences, without considering the correlation between the subechoes received by the radar. Although the relative positions of the scattering centers are fixed, the radar line-of-sight direction changes with the target's attitude during movement, so the position of a particular scattering center will shift across range cells. The echo data of all scattering centers of the target in three-dimensional space are mapped into the one-dimensional space between the target and the receiving antenna, and the shifts of the scattering centers produce fluctuations between the range cells.
The time series correlation across range cells is an essential means for target recognition. To fully extract sequence correlation features [27], this study models the data with a bidirectional long short-term memory network and gate structures and proposes a dual parallel network model based on recurrent neural networks. The model extracts features from HRRP sequence data by multichannel coding and then combines the output features of each classification network by dynamic weight fusion. Compared with previous algorithms, this model needs only a small amount of training data and requires no manual segmentation or template building. In addition, through the long-term memory characteristics of the network, the model can fully extract the structural features of samples, which is more consistent with the theoretical basis of HRRP data. When fusing multichannel target features, adjusting the weights makes the network model more robust, and the recognition accuracy remains at a high level. Experiments based on measured data verify the effectiveness of this model.
The innovations of the model are as follows:
(1) To fully improve the feature extraction ability of LSTM and the accuracy of target recognition, a multilayer bidirectional LSTM network model is introduced to process HRRP sequence recognition tasks. The network structure is shown in Figure 1.
(2) To distinguish the vital information of the sequence, a dual parallel network structure is proposed to process sequence tasks.
(3) A dynamic weight-adjusted fusion mechanism is set according to the amount of target information extracted by each independent sequence network.
2.1. Dual Parallel Sequence Network Structure
The input data are the HRRP sequence of the aircraft target. The high-resolution one-dimensional range profile data are the vector accumulation in each scattering echo unit of the target along the incident direction received by the wideband radar, i.e., the integration of the backscattered echoes over the whole range cell space received by the radar antenna:

$$P_r = \frac{P_t G^2 \lambda^2 \sigma}{(4\pi)^3 R^4 L_s L_a}, \tag{1}$$

where $P_t$ is the transmitter's transmission power, $G$ is the antenna gain, $\lambda$ is the working wavelength, $\sigma$ is the radar cross section (RCS) of the target, $R$ is the distance between the target and the radar antenna, and $L_s$ and $L_a$ are the system and atmospheric losses.
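As a quick numerical illustration of equation (1), the following sketch evaluates the received power for a set of hypothetical parameter values (the values are illustrative only and are not taken from the paper):

```python
import math

# Hypothetical parameter values, for illustration only:
Pt = 1e6                  # transmit power P_t (W)
G = 10 ** (30 / 10)       # antenna gain G (30 dB)
lam = 0.055               # wavelength lambda (m), C-band ~5.5 GHz
sigma = 5.0               # target RCS sigma (m^2)
R = 50e3                  # target range R (m)
Ls = 10 ** (3 / 10)       # system loss L_s (3 dB)
La = 10 ** (1 / 10)       # atmospheric loss L_a (1 dB)

# Radar range equation: received power falls off as R^-4
Pr = (Pt * G**2 * lam**2 * sigma) / ((4 * math.pi) ** 3 * R**4 * Ls * La)
print(f"received power: {Pr:.3e} W")
```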
At time $t$, the backscattered power within a range cell at the radar receiving end is calculated as

$$P_r(t) = \sum_{i \in \Omega_m} \frac{P_t G^2 \lambda^2 \sigma_i}{(4\pi)^3 R_i^4 L_s L_a}, \tag{2}$$

where $\Omega_m$ denotes the set of scattering points falling within the $m$-th range cell.
According to the scattering point model, a scattering point is an ideal geometric point. If the transmitted signal is $s(t) = u(t)\,e^{j2\pi f_c t}$, then for a target with multiple scattering points at different distances, the echo can be written as

$$x(t) = \sum_{i=1}^{N} a_i\, u\!\left(t - \frac{2R_i}{c}\right) e^{j2\pi f_c \left(t - \frac{2R_i}{c}\right)}, \tag{3}$$

where $a_i$ and $R_i$ are, respectively, the amplitude of the echo from the $i$-th scattering point and its distance at a given moment, $u(t)$ is the normalized echo envelope, $f_c$ is the carrier frequency, and $c$ is the speed of light.
If a single-frequency pulse is used for transmission, the narrower the pulse, the wider the signal frequency band; however, it is difficult to transmit very narrow pulses with very high peak power. Generally, wideband signals with a large time width are transmitted, and narrow pulses are obtained through processing after reception. Transforming the echo signal of formula (3) to the frequency domain for processing gives

$$X(f) = U(f - f_c) \sum_{i=1}^{N} a_i\, e^{-j\frac{4\pi f R_i}{c}}, \tag{4}$$

where $U(f)$ is the spectrum of the envelope $u(t)$.
The radar echo is very sensitive on the complex plane to changes in target attitude and distance: the relative phase of each scattering point varies considerably, and the echo amplitude within a single range cell varies greatly. In the input layer of this study, the radar HRRP uses the scattering point model to describe the vector projection of the target scatterer echoes along the radar line-of-sight direction. The three-dimensionally distributed scatterer subechoes of the target are accumulated by vector summation into one-dimensional information, and the projected data contain the position distribution of the strong and weak scattering points of the target and an estimate of its radial size, reflecting the overall shape and local scattering rate characteristics of the target. Based on the above analysis, the scattering point model can be expressed as

$$x_n = \sum_{i=1}^{L_n} a_{ni}\, e^{-j\frac{4\pi}{\lambda} R_{ni}}, \tag{5}$$

where $x_n$ is the complex echo of the $n$-th range cell, and $a_{ni}$ and $R_{ni}$ are the amplitude and radial distance of the $i$-th scattering point falling within that cell.
To obtain robust echo data, the time domain characteristics of radar HRRP are usually obtained after taking the modulus of the signal echo in each range cell. HRRP data containing $T$ range cells can be expressed as

$$\mathbf{x} = \left[\,|x_1|, |x_2|, \ldots, |x_T|\,\right], \tag{6}$$

where $|x_n|$ is the modulus of the echo in the $n$-th range cell.
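To make equations (5) and (6) concrete, the sketch below simulates a toy HRRP with NumPy: randomly placed scatterers are binned into range cells, their subechoes are summed coherently per cell, and the modulus of each cell gives the HRRP vector. All scatterer parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 256                 # number of range cells, matching the paper's sequence dimension
dr = 0.375              # range cell size (m), hypothetical
lam = 0.055             # wavelength (m), hypothetical C-band value
n_scat = 30             # number of scattering points

a = rng.rayleigh(1.0, n_scat)             # scatterer amplitudes a_i
Rr = rng.uniform(0, T * dr, n_scat)       # radial ranges R_i (m)
cells = (Rr / dr).astype(int)             # range cell index of each scatterer

# Coherent (vector) sum of subechoes within each range cell, as in equation (5)
echo = np.zeros(T, dtype=complex)
np.add.at(echo, cells, a * np.exp(-1j * 4 * np.pi * Rr / lam))

x = np.abs(echo)        # HRRP vector [|x_1|, ..., |x_T|], as in equation (6)
```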
2.2. Parallel Sequence Layer
2.2.1. LSTM Network Layer
The LSTM module includes two states, the memory cell state $c_t$ and the hidden state $h_t$, and three "gate" structures: the input gate $i_t$, the forgetting gate $f_t$, and the output gate $o_t$. The detailed structure is shown in Figure 2. For input HRRP sequence data $x = (x_1, x_2, \ldots, x_T)$, the calculation process of the LSTM model is as follows:

$$
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right),\\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right),\\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right),\\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,\\
h_t &= o_t \odot \tanh\left(c_t\right),
\end{aligned} \tag{7}
$$

where ⊙ denotes the Hadamard product, σ and tanh are nonlinear mapping functions, $h_t$ is the hidden layer state at the $t$-th time, $x_t$ is the input at the $t$-th time, $W$ and $U$ are the weight matrices of the model, and $b$ is the bias of the model.
The forgetting gate $f_t$ is used to reset the memory cell, while the input gate $i_t$ and the output gate $o_t$ control the input to and output from the memory cell.
The memory unit determines whether the information at the current time is discarded through the superposition of the calculation results of the forgetting gate and the input gate and then updates the memory unit $c_t$ at time $t$. The hidden state $h_{t-1}$ contains the effective information of the forward-propagated input sequence before time $t$, and the hidden state is updated by the joint action of the memory unit $c_t$ and the output gate $o_t$ at the current time. Through the joint effect of the memory unit $c_t$ and the hidden state $h_t$, the valid information in the sequence data can be retained [28], so the information of previous times in the HRRP data is taken into account when features are extracted at time $t$. At the same time, this structure alleviates the long-term dependence, gradient explosion, and gradient vanishing problems of traditional recurrent networks. Information is associated across the sequence, saved and transmitted over longer spans, and the time series correlation is fully used to identify the input HRRP sequence [27].
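For readers who want to reproduce the LSTM branch, a minimal PyTorch sketch is given below. It treats a preprocessed 256-dimensional HRRP as a sequence of 256 scalar time steps and uses the hidden dimension K = 10 from Section 3.1; the layer count and classifier head are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class BiLSTMBranch(nn.Module):
    """One parallel branch: stacked bidirectional LSTM -> dropout -> classifier."""
    def __init__(self, hidden=10, layers=2, num_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=layers,
                            batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(0.5)          # 50% of hidden nodes dropped (Section 2.3)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                       # x: (batch, 256, 1)
        out, _ = self.lstm(x)                   # (batch, 256, 2 * hidden)
        feat = self.dropout(out[:, -1, :])      # features at the last time step
        return self.fc(feat)                    # class scores y_L

logits = BiLSTMBranch()(torch.randn(8, 256, 1))  # -> shape (8, 3)
```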
2.2.2. GRU Network Layer
The structure of the GRU network differs considerably from that of the LSTM network. A GRU unit has only two gate structures, the update gate $z_t$ and the reset gate $r_t$, and one hidden state $h_t$. The detailed structure is shown in Figure 3. The reset gate controls the degree to which the previous time's information is forgotten, and the update gate controls the importance assigned to the current time's information [27]. For the input HRRP sequence data $x = (x_1, x_2, \ldots, x_T)$, the calculation process of the GRU module is as follows [29]:

$$
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right),\\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right),\\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right),\\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t.
\end{aligned} \tag{8}
$$
Through comparative analysis, we can conclude that both the LSTM and GRU networks perform feature extraction from time series information. The LSTM network adjusts the information retained at different times by means of its memory unit: additional weight is assigned to informative time steps in the sequence so that sufficient detail is extracted in the range cells where the target is located. The LSTM network therefore assigns high weights to the inputs at the time points of strong scattering centers, which are distributed across the HRRP sequence, and the features are mainly represented by the range cells of these scattering centers. The GRU network adjusts the weighting between current information and historical information according to the amount of information at the present time. Similar to the LSTM forgetting gate, the GRU update gate determines the degree to which input information at the current time is retained: when the amount of information at the current time is large, its weight is high, and the weight retained for historical state information is correspondingly reduced. After multiple informative inputs $x_t$ are fed in continuously, a large amount of input information needs to be saved, so the input weight increases, while the weight of historical state information decreases rapidly through the superposition of multistep calculations, diluting the historical hidden state information. Therefore, the GRU network focuses on feature extraction and recognition of the segments where the main scattering centers gather.
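Note that off-the-shelf recurrent layers such as PyTorch's nn.LSTM and nn.GRU do not expose per-step gate activations, which the fusion layer in Section 2.5 requires. The sketch below shows one way to obtain them, assuming the standard gate equations (7): a manual LSTM step that returns the input gate $i_t$ (a GRU step exposing $z_t$ is analogous).

```python
import torch

def lstm_step(x_t, h, c, W, U, b):
    """One manual LSTM step; W: (4K, D), U: (4K, K), b: (4K,), stacked [i, f, g, o]."""
    K = h.shape[0]
    gates = W @ x_t + U @ h + b
    i = torch.sigmoid(gates[0:K])          # input gate i_t, kept for the fusion layer
    f = torch.sigmoid(gates[K:2 * K])      # forgetting gate f_t
    g = torch.tanh(gates[2 * K:3 * K])     # candidate memory c~_t
    o = torch.sigmoid(gates[3 * K:4 * K])  # output gate o_t
    c_new = f * c + i * g                  # memory cell update
    h_new = o * torch.tanh(c_new)          # hidden state update
    return h_new, c_new, i
```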
2.3. Dropout Layer
To avoid overfitting and excessive training time, the network structure proposed in this study adds a dropout layer that randomly discards 50% of the hidden layer nodes in each training batch, thereby reducing the interaction between hidden layer nodes, improving the training speed of the model, preventing overfitting, and enhancing the generalization of the model.
2.4. Full Connection Layer
The full connection layer weights and sums the spatial feature data of the hidden layer in the network and maps the feature data to the classification results, which is equivalent to the function of a classifier. In the network structure proposed in this study, the parallel network obtains two groups of classification results: the classification results $y_L$ for the LSTM branch and the classification results $y_G$ for the GRU branch.
2.5. Fusion Layer
The fusion layer adopts a dynamic gate structure, which fuses the output values of the parallel recurrent network structure by a weight fusion adjustment strategy. The weights of the parallel inputs from the full connection layer to the fusion layer depend on the state of network feature extraction. The full connection layers of the parallel network each output a vector whose dimension equals the number of target types, which is passed on for logistic (softmax) regression classification. The input data $x_t$ in the LSTM structural unit pass through the input gate at every time step to generate an LSTM input gate unit sequence $\mathbf{I} = (i_1, i_2, \ldots, i_T)$; similarly, the GRU network generates an update gate unit sequence $\mathbf{Z} = (z_1, z_2, \ldots, z_T)$ as the data $x_t$ are input.
The output weights of the two independent models are set as the ratio of the 2-norms of the gating unit sequences, that is,

$$w_L = \frac{\|\mathbf{I}\|_2}{\|\mathbf{I}\|_2 + \|\mathbf{Z}\|_2}, \qquad w_G = \frac{\|\mathbf{Z}\|_2}{\|\mathbf{I}\|_2 + \|\mathbf{Z}\|_2}. \tag{9}$$
The fusion layer outputs the final hidden layer output sequence $\mathbf{y}$ after weighting by the output weights $w_L$ and $w_G$ and fusing the classification results $y_L$ and $y_G$ of the full connection layers:

$$\mathbf{y} = w_L\, y_L + w_G\, y_G. \tag{10}$$
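A minimal sketch of this fusion step is shown below, assuming the gate sequences I and Z have already been collected (e.g., with the manual cell step sketched at the end of Section 2.2) and that the 2-norm in equation (9) is taken over the whole gate sequence:

```python
import torch

def fuse(y_lstm, y_gru, I, Z):
    """Fuse branch outputs by the relative 2-norms of the gate sequences, eqs. (9)-(10)."""
    n_l = torch.linalg.norm(I)     # ||I||_2, LSTM input-gate sequence, shape (T, K)
    n_g = torch.linalg.norm(Z)     # ||Z||_2, GRU update-gate sequence, shape (T, K)
    w_l = n_l / (n_l + n_g)
    w_g = n_g / (n_l + n_g)
    return w_l * y_lstm + w_g * y_gru      # fused class scores y

# Toy usage with hypothetical shapes (T=256 time steps, K=10 hidden units, 3 classes):
y_L, y_G = torch.randn(3), torch.randn(3)
I, Z = torch.rand(256, 10), torch.rand(256, 10)
probs = torch.softmax(fuse(y_L, y_G, I, Z), dim=-1)   # softmax output layer (Section 2.6)
pred = probs.argmax().item()                          # predicted class index
```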
2.6. Output Layer
After the hidden layer output sequence $\mathbf{y}$ is obtained, a softmax layer and a classification layer are used to output the category of the sample under test:

$$p_i = \frac{e^{y_i}}{\sum_{j=1}^{C} e^{y_j}}, \tag{11}$$

where $p_i$ represents the probability that the sample sequence under test belongs to class $i$ and $C$ is the number of classes. Finally, the classification layer outputs the class with the highest probability value as the identified type.
3. Experiments and Analysis
3.1. Experimental Data
The experimental data were obtained from field measurements of three types of small- and medium-sized aircraft made by a research institute in China using a C-band broadband radar; the HRRP data are from actual flights of a Yak-42, a Cessna Citation S/II, and an AN-26. These three aircraft types are, respectively, a medium-to-large jet, a small jet, and a small-to-medium propeller aircraft. The ISAR performance and aircraft structural parameters are shown in Table 1.
The flight path projections of the three types of aircraft in the experiment are shown in Figure 4, covering all azimuth angles of the target flight, which sufficiently guarantees the comprehensiveness of classification learning. Non-overlapping training and test sets are obtained by segmenting the flight paths.
According to the segments in the figure, each aircraft's flight path is divided into five to seven segments. The training set selects segments 2 and 3 of the Yak-42, segments 6 and 7 of the Cessna Citation S/II, and segments 5 and 6 of the AN-26. The time domain characteristics of the HRRP data samples are shown in Figure 5. There are 3,000 training samples per category, 9,000 in total, covering the data of every aspect angle domain of the target to ensure the completeness of the training set. The remaining segments are used as the test set to verify recognition performance; the test data comprise 10,000 samples per category, 30,000 in total.
The same training set and test set were used in the comparative experiments, and the dimension of the HRRP sequences after preprocessing was 256. The network hidden layer state dimension K is set to 10. In this model, the trainable parameters are mainly those of the LSTM network structural units and the GRU network structural units. After multiple rounds of comparison experiments, the initial learning rate is set to 0.5 to speed up convergence, and the learning rate is halved every ten training cycles.
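The stated schedule maps directly onto a standard step decay; a sketch in PyTorch is given below (the optimizer choice and the placeholder model are assumptions for illustration):

```python
import torch

model = torch.nn.Linear(256, 3)                        # placeholder module
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
# Halve the learning rate every ten training cycles, as described above
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(50):
    # ... run one training epoch here ...
    scheduler.step()   # lr: 0.5 for epochs 0-9, 0.25 for 10-19, 0.125 for 20-29, ...
```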
3.2. Model Performance Comparison
3.2.1. Comparison of Recognition Performance
Table 2 shows the recognition performance of the dual parallel sequence network (DPSN) proposed in this study compared with other algorithms. Recognition performance is measured as the proportion of correctly identified samples in the entire test set. The comparison models include maximum cross correlation (MCC), the adaptive Gaussian classifier (AGC), and the hidden Markov model (HMM). In addition, this study also compares long short-term memory (LSTM) and gated recurrent unit (GRU) networks.
From Table 2, we can see that, compared with the traditional models, the recurrent neural networks better learn the time series relationships in sequence data; considering the time series correlation effectively improves the recognition accuracy on sequence data. Traditional models mostly use statistical recognition and kernel methods, obtaining the statistical distribution parameters of sample information for model matching recognition. However, HRRP data are high-dimensional and correlated across range cells, so it is difficult to obtain high recognition accuracy with statistical models. Meanwhile, kernel-based classification maps the data from the linearly inseparable original space to a high-dimensional separable space, which requires a large number of samples and a large kernel matrix, significantly increasing the computational complexity of the model. As can also be seen from Table 2, among the network models using sequence input, the DPSN algorithm has significantly higher recognition accuracy than the other networks. The network depth in Table 2 is set to 10, and the DPSN algorithm maintains a good recognition effect.
3.2.2. Comparison of Recognition Performance by Changing Network Depth
Figure 6 compares the recognition accuracies as the network depth changes. The recognition accuracy of the DPSN changes little with network depth, showing high robustness: the performance of the algorithm remains good after the depth changes. In contrast, for the LSTM, GRU, and other network models, the recognition performance depends strongly on the depth parameter. Taking the BiLSTM network as an example, when the number of layers is set to 5 versus 6, the recognition accuracies differ by 15%. Therefore, the dual parallel network better overcomes the influence of sequence network depth on target recognition, the fluctuation of recognition accuracy is slight, and it has better target classification and recognition ability.
3.3. Comparison of Deep Robustness Experiments
3.3.1. Comparison of Recognition Performance by Monte Carlo Simulations
Table 3 shows the stability comparison of network recognition accuracy as the network model depth changes. The experiment traverses network depths of 2–20 layers. After 100 Monte Carlo simulations, the recognition accuracy data of the different algorithms are obtained: the mean is the average accuracy over all recognition results of an algorithm, and the variance is the statistical variance of the recognition results. From the data, the average recognition accuracy of the DPSN is higher than that of the other algorithms, and its statistical variance is lower than that of algorithms of the same class. The experiments show that the DPSN algorithm is reliable, its recognition performance does not fluctuate violently with changes in network depth, and it has good robustness.
3.3.2. Comparison of Recognition Performance by Reducing Training Data
Figure 7 compares the recognition accuracies of the different algorithms as the amount of training data is reduced. The abscissa is the factor by which the training data are scaled down; a value of 1 means the data volume is not reduced, with 10,000 training samples for each type of target and 30,000 training samples in total. As can be seen from the figure, when the training data are reduced by a factor of ten to 10% of the original volume, i.e., only 1,000 training samples are retained per class, all of the algorithms generally maintain stable recognition performance. When the data volume is reduced further, the DPSN algorithm still maintains stable recognition performance, while the recognition accuracies of LSTM, GRU, and BiLSTM each drop by about 15%. The DPSN algorithm thus maintains good recognition performance on small-sample data. Moreover, the recognition performance of the DPSN remains high as the network depth changes, without violent fluctuation, and good recognition can still be achieved when the network depth is low.
4. Conclusions
Aiming at the problem of radar HRRP automatic target recognition, an HRRP target recognition method based on a dual parallel network is proposed. By setting up independent dual parallel time series networks, features are extracted from the radar one-dimensional sequence data, and the time series correlation features between the range cells of the target sequence are captured. On this basis, a fusion mechanism with dynamic gate structure weight adjustment is proposed, which effectively improves the stability of feature extraction and is robust to changes in network depth. Experiments show that the proposed model can extract compelling features for recognition, with good recognition performance and high network robustness.
Data Availability
The data used to support the findings of this study are included in the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The study was supported by the Research Team Development fund (F3504, Feature Extraction and Recognition Technology of Aerial Target’s Track).