Abstract
For the problem of reliable decision in synthetic aperture radar (SAR) target recognition, a method based on updated classifiers is proposed. The convolutional neural network (CNN) and support vector machine (SVM) are used as basic classifiers to classify samples with unknown target labels. The two decisions are fused and the reliability of the fused decision is evaluated. The classified test samples with high reliabilities are added to the original training samples to update the classifiers. The updated classifiers have stronger classification abilities and the fused result of the two classifiers can obtain a more reliable decision. The proposed method is tested and verified based on the moving and stationary target acquisition and recognition (MSTAR) dataset. The experimental results verify the effectiveness and robustness of the proposed method.
1. Introduction
High-resolution synthetic aperture radar (SAR) can provide strong support for earth observation. In the military field, SAR images can be used to detect and identify targets of interest to obtain high-value intelligence. The problem of SAR target recognition is a specific application of traditional pattern recognition technology [1]. With the support of labeled training samples, the classification of unknown test samples can be performed. Feature extraction and classifier design are two important steps in SAR target recognition methods. The former obtains highly discriminative features through the analysis of SAR images, thereby improving the overall accuracy and efficiency of subsequent classification. Commonly used SAR image features mainly describe target geometry [2–7], electromagnetic scattering characteristics [8–10], or distributions of pixel values by projection or transformation [11–18]. The classifier learns a reliable decision surface from a large number of training samples and then classifies the test samples. The classifiers commonly used for SAR target recognition include nearest neighbor (NN) [11], support vector machine (SVM) [19–22], and sparse representation-based classification (SRC) [23–27]. In addition, multiclassifier fusion is also used in SAR target recognition [28–31]. In recent years, deep learning algorithms have been widely applied and verified in various research fields, and a large number of SAR target recognition methods based on deep learning models have emerged. A typical representative is the convolutional neural network (CNN) [32–45]. Deep learning methods integrate feature learning and classification, thus avoiding the separation of feature extraction and classifier design in traditional methods. However, deep learning methods require a large number of training samples, and the final classification accuracy is often poor when few training samples are available.
Different from the field of optical image processing, the data resources of SAR images are scarce. As a result, it is difficult to train a reliable deep learning classification model, which brings obstacles to the application of related methods. In [40, 41], more available samples were generated by means of data augmentation to improve the classification performance of the networks. In [42], multisource image data (such as optical images and electromagnetic simulation data) were processed through transfer learning to assist in training neural networks suitable for SAR target recognition.
This paper proposes a SAR target recognition method based on updated classifiers. The key idea is to use those test samples whose categories can be reliably confirmed during the classification process to optimize the original classifiers. In detail, a CNN is first designed as the dominant classifier for SAR target recognition. In addition, this paper selects SVM as an auxiliary classifier, which confirms the classification result of the test sample together with CNN. The original training samples are used to train the CNN and SVM classifiers. A test sample to be recognized is first classified by CNN and SVM separately to obtain the corresponding decision values. Then, the final decision value is obtained through weighted fusion. Based on the fused decision values, the reliability of the final decision is calculated. When the reliability level of the current test sample is higher than the preset threshold, the sample is further used as a training sample to optimize the CNN and also augments the training samples for SVM. As the number of test samples with confirmed labels grows, the classification performance of CNN and SVM is continuously enhanced, so the recognition results obtained by fusing the two classifiers are more reliable. The main contributions of this paper are as follows. First, the update mechanism is introduced under the decision fusion framework of CNN and SVM classifiers. Decision fusion of different classifiers is a common way to improve decision accuracy. In traditional methods, the classified test samples are not fully used, and an online update of the classifier is lacking. In the problem of SAR target recognition, the number of training samples is very limited. This paper confirms the test samples and uses them to update the classifiers, which can effectively improve their classification ability.
Second, an effective criterion for decision-making reliability is developed and used for the selection of test samples. Although the accuracy of the fused decision is improved, there is still some probability of misclassification. Introducing test samples with wrong target labels will degrade the performance of the updated classifiers. Based on the fusion of probabilistic decision variables, this paper defines decision reliability levels and selects only those test samples whose decision reliability is higher than the preset threshold to update the classifiers. In this way, the effectiveness of the updated classifiers can be ensured. In the experiments, the proposed method is evaluated on the moving and stationary target acquisition and recognition (MSTAR) dataset. The results show that the proposed method has advantages over a single classifier and traditional classifier decision fusion methods.
2. Basic Classifiers
2.1. CNN
CNN is an extension of traditional neural networks, which can be used for two-dimensional signal (image) processing. It conducts deep mining of original data by setting multiple different convolutional layers. In each convolutional layer, different convolution kernels are used to extract two-dimensional features from the original data, so as to obtain multilevel features. Finally, the network builds an efficient classification framework through end-to-end training to realize the classification of test samples. At present, CNN has been widely used in the field of image processing, and a series of CNN-based SAR target recognition methods have appeared [32–45]. It should be pointed out that the classification performance of CNN is closely related to the size and coverage of training samples. When the number of training samples is small, the finally trained network has poor adaptability and cannot handle the SAR target recognition task well.
Based on the existing research studies, this paper designs a CNN as shown in Figure 1, which includes three convolutional layers, three maximum pooling layers, and two fully connected layers. In each convolutional layer, the rectified linear unit (ReLU) is used as the activation function to enhance the nonlinear adaptability of the network. The maximum pooling layer is set after the convolutional layer to improve the overall training efficiency of the network. Finally, the conversion of input data to category labels is achieved through two fully connected layers (take 10-class recognition as an example). The end of the network uses Softmax as the basic classifier and outputs the possibility that the test sample belongs to each class in the form of posterior probability. In addition, the overall complexity of the network structure is relatively low, which is beneficial to improve the efficiency of overall SAR target recognition.
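To make the layer layout concrete, the following minimal sketch traces how the spatial size of the feature maps shrinks through three convolution and max-pooling stages before the fully connected layers. The input size (88 × 88), the 5 × 5 kernels, the 2 × 2 pooling windows, and the channel counts are illustrative assumptions; the text above does not state them, so Figure 1 remains the reference for the actual configuration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Spatial size after max pooling."""
    return (size - window) // stride + 1


def trace_shapes(input_size, conv_kernels, channels):
    """Return (layer_name, channels, side_length) after each conv and pool layer."""
    shapes = []
    size = input_size
    for idx, (kernel, ch) in enumerate(zip(conv_kernels, channels), start=1):
        size = conv_out(size, kernel)      # convolution followed by ReLU
        shapes.append((f"conv{idx}+relu", ch, size))
        size = pool_out(size)              # max pooling after each convolution
        shapes.append((f"maxpool{idx}", ch, size))
    return shapes


# Hypothetical settings (assumptions, not taken from the paper):
# 88x88 input chip, 5x5 kernels, 16/32/64 feature maps.
shapes = trace_shapes(88, conv_kernels=[5, 5, 5], channels=[16, 32, 64])
for name, ch, size in shapes:
    print(f"{name:14s} -> {ch} x {size} x {size}")

# Number of inputs seen by the first fully connected layer.
_, last_ch, last_size = shapes[-1]
flattened = last_ch * last_size * last_size
print("flattened features:", flattened)
```

Under these assumed settings, the second fully connected layer would map to 10 outputs (one per class), which Softmax then turns into posterior probabilities.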

2.2. SVM
SVM is chosen as another classifier in this paper. Since it was first proposed by Vapnik et al. in 1995, SVM has been one of the most popular classifiers in the field of pattern recognition. In Zhao and Principe [19], SVM was first introduced to SAR target recognition with good performance. Afterwards, many related studies built on SVM and improved the recognition performance [20–22]. Based on the principle of structural risk minimization, SVM aims to find a hyperplane that separates the patterns of two different classes in a two-class classification problem. For a test sample $x$, SVM generally makes the decision as follows:
$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + b\right) \tag{1}$$
In equation (1), $x_i$ represents a support vector from the training samples; $y_i \in \{-1, +1\}$ denotes the labels of the two different classes. $\alpha_i$ are the weights and $b$ is the bias, which are the parameters to be estimated during the training process. $K(\cdot, \cdot)$ is the kernel function, which can be specifically designed to handle different types of classification problems. For example, the polynomial and radial basis function (RBF) kernels are two commonly used kernel functions in SVM.
The traditional two-class SVM can be generalized to multiclass ones via the strategies such as “one-versus-one” or “one-versus-rest”. Then, SVM can be directly used to classify multiple classes. The famous LIBSVM [46] provided an excellent toolbox to use SVM for different applications, which can be smoothly used for multiclass recognition problems. In this paper, the multiclass SVM with RBF kernel is employed to perform the classification for SAR images.
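As an illustration of the two-class decision rule described above (a weighted sum of kernel evaluations against the support vectors, plus a bias), the sketch below evaluates it with an RBF kernel. The support vectors, weights, bias, and the kernel width `gamma` are made-up toy values; in practice they would come from training, for example with LIBSVM.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svm_decision_value(x, support_vectors, labels, alphas, bias, gamma=0.5):
    """Weighted sum of kernel evaluations against the support vectors, plus bias."""
    total = sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return total + bias


def svm_predict(x, support_vectors, labels, alphas, bias, gamma=0.5):
    """Two-class decision: the sign of the decision value."""
    value = svm_decision_value(x, support_vectors, labels, alphas, bias, gamma)
    return 1 if value >= 0 else -1


# Toy parameters (hypothetical, not learned from MSTAR data).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, +1]
alphas = [1.0, 1.0]
bias = 0.0

print(svm_predict((1.8, 2.1), support_vectors, labels, alphas, bias))
print(svm_predict((0.1, -0.2), support_vectors, labels, alphas, bias))
```

A multiclass classifier can then be assembled from several such two-class rules using the one-versus-one or one-versus-rest strategy mentioned above.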
3. Updated Classifiers for Target Recognition
3.1. Principle for Decision and Updating Classifiers
For the classification results from CNN and SVM, this paper uses a weighted fusion algorithm to obtain the final decision values. CNN uses Softmax as the classifier, and its output decision value is the posterior probability vector $P_{\mathrm{CNN}}$. For SVM, the output result is also in the form of probabilities, denoted as $P_{\mathrm{SVM}}$. Afterwards, the classical weighting (equal weight) algorithm [24, 25] is used to fuse the posterior probability vectors of CNN and SVM, as follows:
$$P = w_1 P_{\mathrm{CNN}} + w_2 P_{\mathrm{SVM}}, \quad w_1 = w_2$$
where $P$ denotes the final decision variables after fusion. Accordingly, this paper defines the decision reliability $R$ from the fused vector $P$, in which $P_k$ is the maximum probability.
Correspondingly, the larger the value of $R$, the more reliable the classification result. An appropriate decision threshold is set. When the decision reliability is higher than the threshold, the current decision is considered reliable, and the corresponding test sample is added to the original training samples to update CNN and SVM. Otherwise, the training set is not updated.
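The fusion and thresholding logic can be sketched in a few lines. Two points in this sketch are assumptions rather than details confirmed by the text: the equal weights are taken as 0.5 each, and the reliability is computed as the ratio of the largest to the second-largest fused probability. The ratio is only one plausible reading that yields values above 1, in line with the thresholds used in the experiments; it does not depend on the common scale of the equal weights.

```python
def fuse_equal_weight(p_cnn, p_svm):
    """Equal-weight fusion of two posterior probability vectors (weights assumed 0.5)."""
    if len(p_cnn) != len(p_svm):
        raise ValueError("probability vectors must have the same length")
    return [0.5 * (a + b) for a, b in zip(p_cnn, p_svm)]


def reliability_ratio(p_fused):
    """ASSUMED reliability: largest fused probability over the second largest."""
    ranked = sorted(p_fused, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    return float("inf") if runner_up == 0 else top / runner_up


def decide(p_cnn, p_svm, threshold=1.4):
    """Return (predicted class index, reliability, whether the decision is reliable)."""
    fused = fuse_equal_weight(p_cnn, p_svm)
    label = max(range(len(fused)), key=fused.__getitem__)
    reliability = reliability_ratio(fused)
    return label, reliability, reliability > threshold


# Both classifiers agree clearly: the decision passes the threshold.
print(decide([0.7, 0.2, 0.1], [0.5, 0.4, 0.1]))
# The classifiers disagree: the fused decision is not reliable enough.
print(decide([0.5, 0.4, 0.1], [0.3, 0.6, 0.1]))
```

Only samples for which the third returned value is true would be appended to the training set.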
3.2. Procedure of Target Recognition
In this paper, the training set is dynamically updated by analyzing the reliability of the decisions on test samples. Therefore, the classification performance of CNN and SVM can be enhanced, so their fused result will be more reliable. The key steps of the implementation of the proposed method are shown in Figure 2 and can be summarized as follows:
Step 1: the original training samples are used to train the CNN shown in Figure 1 and the SVM classifier.
Step 2: for a test sample with an unknown target label, CNN and SVM are used to classify it separately. Their results are fused and the reliability level of the decision is calculated.
Step 3: if the decision reliability of the current test sample is higher than the preset threshold, the sample is added to the original training samples, which dynamically updates the training set of CNN and SVM.

As the number of test samples with confirmed target labels increases, the classification capabilities of CNN and SVM are continuously enhanced. Therefore, the classification accuracy on subsequent test samples, and hence the overall accuracy of target recognition, can be improved.
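The three-step procedure can be sketched as a simple loop. To keep the example self-contained, the CNN and SVM are replaced by two stand-in probabilistic classifiers (nearest-centroid models that differ only in a `sharpness` parameter); these stand-ins, the 0.5 fusion weights, and the max-to-second-max reliability ratio are all assumptions made for illustration and are not the classifiers or the exact formula of the proposed method.

```python
import math


def fit_centroids(samples):
    """Train a stand-in classifier: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_proba(centroids, x, sharpness):
    """Softmax over negative squared distances to the class centroids."""
    labels = sorted(centroids)
    scores = [
        -sharpness * sum((a - b) ** 2 for a, b in zip(x, centroids[y]))
        for y in labels
    ]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return labels, [e / total for e in exps]


def recognize_with_updates(train, test_stream, threshold=1.4):
    """Classify test samples one by one and grow the training set with reliable ones."""
    train = list(train)
    model = fit_centroids(train)                      # Step 1: initial training
    decisions = []
    for x in test_stream:
        # Step 2: classify with both stand-ins, fuse, and score the decision.
        labels, p_a = predict_proba(model, x, sharpness=1.0)
        _, p_b = predict_proba(model, x, sharpness=0.5)
        fused = [0.5 * (a + b) for a, b in zip(p_a, p_b)]
        ranked = sorted(fused, reverse=True)
        reliability = ranked[0] / ranked[1] if ranked[1] > 0 else float("inf")
        label = labels[fused.index(ranked[0])]
        decisions.append((label, reliability))
        # Step 3: reliable decisions extend the training set; retrain.
        if reliability > threshold:
            train.append((x, label))
            model = fit_centroids(train)
    return decisions, train


train = [((0.0, 0.0), "A"), ((1.0, 1.0), "B")]
test_stream = [(0.1, 0.0), (0.5, 0.5), (0.9, 1.1)]
decisions, final_train = recognize_with_updates(train, test_stream)
for label, reliability in decisions:
    print(label, round(reliability, 2))
print("training set size:", len(final_train))
```

In this toy run the ambiguous middle sample is classified but not added to the training set, whereas the two clear-cut samples are. Retraining after every accepted sample is a simplification; with a real CNN, batching the accepted samples before fine-tuning would likely be more practical.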
4. Experiments
4.1. MSTAR Dataset
The MSTAR dataset is a representative benchmark that has long been used for testing SAR target recognition methods. Figure 3 shows the optical and SAR images of the 10 targets in the dataset. The SAR images cover the full range of aspect angles and several depression angles, and the image resolution is 0.3 m. Through flexible processing and simulation on the original SAR images, a variety of operating conditions can be set up to test the proposed method, including the standard operating condition (SOC) and extended operating conditions (EOCs).

Several types of reference methods are set up in the experiments, including the method of SVM-based classifier in [19] (denoted as SVM), the cascade coupled CNN designed in [36] (denoted as CSCNN), the CNN with data augmentation used in [41] (denoted as Aug-CNN), and the method based on the decision fusion of SVM and SRC in [22] (denoted as SVM + SRC).
In the follow-up experiments, the validation is first carried out under SOC to analyze and verify the basic performance of the proposed method. Then, several typical EOCs are established to verify the robustness of the proposed method under the conditions of depression angle variance, noise corruption, and reduced training set.
4.2. Results and Discussion
4.2.1. SOC
Table 1 shows a typical SOC based on the MSTAR dataset. The training and test sets contain the SAR images of the 10 types of targets acquired at the depression angles of 17° and 15°, respectively, and both cover 0°∼360° azimuth angles. With the reliability threshold set to 1.4, the proposed method classifies the test samples and obtains the corresponding results. The classification accuracy for every target remains above 98%. The recognition rates of BMP2 and T72 are relatively low due to the configuration differences between their training and test samples. Among the remaining 8 types of targets, the recognition rate of BTR60 is relatively low, mainly because its test samples share high similarity with the samples of BMP2 and T72 targets, which increases the probability of false classification. The four types of reference methods are also tested under the same condition, and the average recognition rates of all methods for the 10 types of targets are summarized in Table 2. The comparison shows that the recognition performance of the proposed method under SOC is better than that of the reference methods, which verifies its effectiveness. Compared with the methods using SVM or CNN alone, this paper confirms the test samples, enriches the training set, and combines the classification results of the two classifiers, which significantly improves the final classification accuracy. The Aug-CNN method improves the performance of a traditional CNN by simulating training samples, but the overall amount of information in the original training set is still limited. As a result, the improvement is also very limited, and the augmentation brings additional time consumption. Compared with the fusion method of SVM and SRC, the proposed method dynamically updates the two classifiers while fusing CNN and SVM, and the enhancement of the final recognition performance is also evident.
According to the basic principles of the proposed method, the selection of the reliability threshold has a direct impact on the classifier update and the final recognition performance. In the previous experiment, the reliability threshold was set to 1.4. This experiment examines this key parameter and investigates its influence on the final recognition performance. Table 3 lists the average recognition rates of the proposed method for the 10 types of targets at different thresholds. It can be seen that the proposed method maintains an average recognition rate over 98.5% at each threshold, but its performance is also directly related to the threshold selection. When the threshold is relatively low (for example, 1.1), the reliability requirement is easily met, and more samples are used for updating the classifiers. However, some of these samples are misclassified, which degrades the classifiers and lowers the overall average recognition rate. When the threshold is large (for example, 1.7), the reliability requirement is very strict, so only a few test samples can be used for the classifier update, and the proposed method degenerates into a traditional decision fusion method. After comparative analysis and repeated experiments, this paper selects 1.4 as an appropriate threshold. This threshold can effectively ensure the rationality of test sample selection and the reliability of the classifier update.
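The trade-off described above can be illustrated by counting, for each candidate threshold, how many decisions would be accepted for the update and how many of those are wrong. The (reliability, correctness) pairs below are invented for illustration only and are not results from the experiments.

```python
def acceptance_summary(decisions, thresholds):
    """For each threshold, return (accepted count, wrongly labeled accepted count)."""
    summary = {}
    for t in thresholds:
        accepted = [is_correct for reliability, is_correct in decisions if reliability > t]
        summary[t] = (len(accepted), accepted.count(False))
    return summary


# Hypothetical (reliability, decision_was_correct) pairs; not experimental data.
decisions = [
    (3.2, True), (1.05, False), (1.5, True), (1.2, False),
    (2.4, True), (1.15, True), (1.8, True), (1.3, True),
]

summary = acceptance_summary(decisions, thresholds=[1.1, 1.4, 1.7])
for threshold, (n_accepted, n_wrong) in summary.items():
    print(f"threshold {threshold}: {n_accepted} accepted, {n_wrong} with a wrong label")
```

In this toy data a low threshold admits a mislabeled sample into the training set, while a high threshold admits few samples at all, which mirrors the two failure modes discussed in the paragraph.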
4.2.2. Depression Angle Variance
A difference in depression angle between the test samples and the training set causes differences in the characteristics of the SAR images. Table 4 shows the training and test sets under the condition of depression angle variance, constructed from the MSTAR dataset and including 3 types of targets. The training set uses SAR images with a depression angle of 17°, and the test set contains samples with two depression angles of 30° and 45°. Experiments are carried out at the two depression angles separately. The average recognition rates of the various methods are summarized in Table 5. The recognition performance of all methods at the 45° depression angle is significantly lower than at 30°, indicating that a large depression angle difference seriously affects correct decisions. Through the use of highly reliable test samples and the fusion of CNN and SVM, the proposed method achieves better recognition performance than the reference ones. Compared with the CSCNN, SVM, and Aug-CNN methods, which use single classifiers, the performance superiority of the proposed method is significant, which demonstrates the effectiveness of decision fusion and dynamic update of the classifiers. Compared with the SVM + SRC method, this paper introduces CNN with stronger classification performance. In addition, the reliably classified samples are used to update CNN and SVM, so the final decision result has higher reliability.
4.2.3. Noise Corruption
In order to test the performance of the proposed method under noise corruption, different degrees of noise are added to the original test samples in Table 1 to construct test sets at different signal-to-noise ratios (SNRs). All the methods are tested at the different SNRs, and the results are shown in Figure 4. It can be seen that the proposed method maintains the highest recognition rate at each SNR, showing its better noise robustness. The Aug-CNN method ranks second, behind only the proposed method, mainly because it adds noisy samples generated by simulation to augment the training set, which improves the noise adaptability of the network. Since the training samples used in the CSCNN method all have high SNRs, its classification accuracy under noise corruption, especially for test samples at low SNRs, is significantly reduced. The proposed method uses the combination of CNN and SVM to dynamically update the training samples, which improves the coverage of noise corruption situations, so the final decision can better handle the noisy samples.
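One common way to build such test sets is to scale the noise power to the signal power of each image. The sketch below assumes additive white Gaussian noise and treats an image as a flat list of pixel values; the text above does not specify the noise model, so both choices, as well as the function names, are assumptions for illustration.

```python
import math
import random


def add_noise_at_snr(pixels, snr_db, rng):
    """Add zero-mean Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = sum(v * v for v in pixels) / len(pixels)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in pixels]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(v * v for v in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the example is repeatable
clean = [math.sin(i / 7) + 1.5 for i in range(4000)]  # stand-in for image pixels
for snr_db in (10, 5, 0, -5, -10):
    noisy = add_noise_at_snr(clean, snr_db, rng)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

The measured SNR differs slightly from the target because the noise is a finite random sample.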

4.2.4. Reduced Training Set
As mentioned above, the training samples available in SAR target recognition are often very limited. As a result, it is difficult to cover the possible situations of the test samples (such as view angles and noise levels). Therefore, it is very important to improve the robustness of the recognition method under the condition of limited training samples. Based on Table 1, 80%, 60%, 40%, and 20% of the original training set are randomly selected to construct reduced training sets and then the original test sets are classified. The results of different methods achieved under reduced training sets are shown in Figure 5. The proposed method has obvious performance advantages in the case of fewer training samples. On the one hand, the fusion of CNN and SVM at the decision-making layer improves the fault tolerance of the overall decision-making. On the other hand, with the increasing number of confirmed test samples, the training set has actually been effectively supplemented, so the updated CNN and SVM classification results are more reliable. The four types of reference methods are limited by the size of the training samples, so their overall recognition performance is also limited.
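The construction of the reduced training sets can be sketched as a random subsampling step. The version below samples the same fraction from every class so that no target disappears from a small training set; whether the selection in the experiments was done per class or over the whole set is not stated, so the per-class choice, the function name, and the toy data are assumptions.

```python
import random


def reduce_training_set(samples, fraction, seed=0):
    """Randomly keep the given fraction of samples from each class.

    samples: list of (data, label) pairs. At least one sample per class is kept.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample in samples:
        by_class.setdefault(sample[1], []).append(sample)
    reduced = []
    for label in sorted(by_class):
        group = by_class[label]
        keep = max(1, round(fraction * len(group)))
        reduced.extend(rng.sample(group, keep))
    return reduced


# Toy training set: integer IDs stand in for SAR image chips.
samples = [(i, "T72") for i in range(10)] + [(i, "BMP2") for i in range(20)]
for fraction in (0.8, 0.6, 0.4, 0.2):
    subset = reduce_training_set(samples, fraction)
    print(fraction, len(subset))
```

Fixing the seed makes each reduced set reproducible, which keeps comparisons between methods on the same subset fair.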

5. Conclusion
This paper proposes a SAR target recognition method based on updating the classifiers, which continuously extends the available training samples by confirming the target labels of test samples, thereby improving the classification performance. CNN and SVM are used as basic classifiers, and the updated training samples improve the classification performance of each. At the same time, the two results are combined at the decision-making stage to obtain a more reliable recognition result. Experiments are carried out on the MSTAR dataset, and the classification performance of the proposed method is tested under SOC, depression angle variance, noise corruption, and reduced training sets and compared with several existing methods. The experimental results show that the recognition performance of the proposed method is better than that of the existing methods under all conditions, verifying its effectiveness.
Data Availability
The MSTAR dataset is publicly available.
Conflicts of Interest
The authors declare no conflicts of interest.