Abstract
In order to handle the problem of synthetic aperture radar (SAR) target recognition, an improved sparse representation-based classification (SRC) is proposed. According to the sparse coefficient vector resulting from the global dictionary, the largest coefficient in each class is taken as the reference. Then, the surrounding neighborhoods of the sample with the largest coefficient are selected to construct the optimal local dictionary in each training class. Afterwards, the samples in the local dictionary are used to reconstruct the test sample to be identified. Finally, the decision is made according to the comparison of the reconstruction errors from different classes. In the experiments, the proposed method is verified based on the moving and stationary target acquisition and recognition (MSTAR) dataset. The results show that the proposed method has performance advantages over existing methods, which demonstrates its effectiveness and robustness.
1. Introduction
Synthetic aperture radar (SAR) can acquire high-resolution images for effective ground observation and surveillance. Image interpretation technologies represented by SAR target recognition are widely used in military and civilian fields. SAR target recognition is a typical image pattern recognition problem, which aims to extract and classify the target of interest in SAR images [1, 2]. To improve the overall performance of SAR target recognition, researchers have extensively applied advanced image feature extraction and classification algorithms. Many types of features have been applied to SAR target recognition, including geometric shape features, projection transformation features, and electromagnetic scattering features. The geometric shape features describe the two-dimensional shape distribution of the target, such as area and contour [3–10]. The projection transformation features use mathematical projection or signal transformation algorithms to extract stable characteristics of the original images [11–16]. Electromagnetic scattering features describe the backscattering characteristics of the target in a specific radar frequency band, typically including the scattering centers [17–20] and polarizations [21]. In the classification stage, a suitable classifier is selected to determine the class from the extracted features. Early classification strategies were mainly based on the idea of nearest neighbors, such as the K-nearest neighbor (KNN) classifier [11]. With the development of pattern recognition technologies, new classifiers such as the support vector machine (SVM) [22, 23], multilayer perceptron (MLP) [12], and adaptive boosting (AdaBoost) [24] emerged. In recent years, deep learning has become a popular approach in image interpretation and has also been widely used in SAR target recognition [25–30].
Sparse representation-based classification (SRC), derived from compressive sensing theory, has also been widely used in pattern recognition and image classification [31–37]. Researchers introduced SRC into SAR target recognition and verified its feasibility. Since then, further works have continued to improve the overall recognition performance by optimizing the solution algorithm and the decision-making mechanism [32–36].
Compared with other classifiers, SRC uses a linear fitting idea to evaluate the similarity between the test sample and each training class, so no pretraining is required. In addition, results in the existing literature indicate that the sparse representation itself has a certain degree of robustness to the common extended operating conditions (EOCs) [38, 39] in SAR target recognition, such as noise corruption and partial occlusion. Therefore, SRC has broad application prospects in SAR target recognition. In this paper, the traditional SRC is improved to enhance the performance in SAR target recognition. First, the test sample is reconstructed by SRC on the global dictionary, and the sparse coefficient vector is obtained. Afterwards, the optimal local samples are selected in each class according to a fixed criterion: the sample with the largest coefficient is taken as the reference, and its surrounding neighbors are selected to construct a local dictionary. Finally, the test sample is optimally reconstructed on the local dictionaries from different classes to obtain their corresponding reconstruction errors, and the target class of the test sample is decided based on the principle of the minimum error. In the experiments, based on the moving and stationary target acquisition and recognition (MSTAR) dataset, the proposed method is tested under the standard operating condition (SOC) and three EOCs (configuration variance, depression angle difference, and noise corruption). Experimental results show that the proposed method can achieve superior performance over some existing methods in all four typical scenarios, verifying its effectiveness and robustness.
2. SRC
Sparse representation is based on the theory of compressive sensing and analyzes the characteristics of a sample by linearly representing the unknown sample over an overcomplete dictionary. Wright et al. first applied SRC to face recognition [31], determining the category of the test sample from the reconstruction error of each class, calculated with the sparse representation coefficients. Specifically, a global dictionary $A = [A_1, A_2, \ldots, A_C] \in \mathbb{R}^{d \times N}$ composed of $C$ training classes is first constructed, where $A_i \in \mathbb{R}^{d \times N_i}$ represents the atoms corresponding to the $N_i$ training samples in the $i$th class. For the test sample $y$ to be identified, equation (1) is employed to perform the sparse linear representation:
$$\hat{x} = \arg\min_{x} \|x\|_0 \quad \text{s.t.} \quad \|y - Ax\|_2^2 \le \varepsilon, \tag{1}$$
where $x$ is the sparse coefficient vector to be solved and $\varepsilon$ is the set error threshold.
Since the direct solution of the optimization problem in equation (1) is very complicated, researchers have tried to obtain high-confidence approximate solutions through the principle of equivalent approximation. For example, in [31], the $\ell_1$ norm was used to replace the original $\ell_0$ norm to convert the problem into a convex optimization one, which is easier to solve. In [32], an orthogonal matching pursuit (OMP) algorithm was developed based on a greedy mechanism to improve the overall solution efficiency. Given the solved sparse coefficient vector, the target class of the test sample can be judged according to how the coefficients are distributed over the different classes. In [33], the decision was made according to the energy of the nonzero coefficients in different classes. Among the many principles, the criterion based on the minimum reconstruction error is the most widely used. The basic idea is to linearly reconstruct the test sample with the samples of each class and then calculate the reconstruction error as follows:
$$r(i) = \|y - A_i \hat{x}_i\|_2^2, \quad i = 1, 2, \ldots, C, \tag{2}$$
where $A_i$ contains the atoms of the $i$th class, $\hat{x}_i$ is the coefficient vector distributed on the $i$th class, and $r(i)$ is the reconstruction error of the $i$th class to the test sample $y$. Finally, the target class of the test sample can be determined by comparing the errors from different classes.
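The traditional SRC pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in [31] or [32]: the function names `omp` and `src_classify`, the fixed sparsity level `n_nonzero` (used here in place of an error threshold), and the array layout (atoms as columns, one integer label per atom) are all hypothetical choices made for the example.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Greedy orthogonal matching pursuit: find a sparse x with A @ x close to y.

    Assumption: sparsity is controlled by a fixed number of atoms (n_nonzero >= 1)
    rather than by an error threshold.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    x[support] = coef
    return x

def src_classify(A, labels, y, n_nonzero=5):
    """Minimum reconstruction error rule: return (predicted class, per-class errors)."""
    x = omp(A, y, n_nonzero)
    errors = {}
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruct y using only the coefficients that fall on class c.
        errors[int(c)] = float(np.sum((y - A[:, mask] @ x[mask]) ** 2))
    return min(errors, key=errors.get), errors
```

Calling `src_classify(A, labels, y)` with a dictionary `A` of shape (d, N) and an integer array `labels` of length N returns the class whose atoms give the smallest squared reconstruction error, together with all per-class errors.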
Although the decision criteria in the traditional SRC have certain validity, their characterization ability for each class is not sufficiently exploited. In the minimum reconstruction error criterion, all samples in each class are used for reconstruction. In fact, due to the azimuthal sensitivity of SAR images, the training samples related to the test sample should share a similar azimuth angle. Therefore, in order to obtain a better reconstruction result, the test sample should be reconstructed and analyzed on the local dictionary.
3. Improvement of SRC for Target Recognition
3.1. Improved SRC with Local Reconstruction
This paper improves the traditional SRC for SAR target recognition. Before implementing the traditional SRC, the atoms of each class in the dictionary are arranged in ascending azimuth order so that adjacent samples have the closest azimuths. Afterwards, the global sparse coefficients are solved according to equation (1). Then, the optimal local dictionary is selected and constructed in each class. Taking the $i$th class as an example, the atom with the maximum coefficient is used as the reference sample, and the surrounding neighborhood samples are chosen. Because SAR images are sensitive to azimuth, it is generally believed that they maintain a high correlation only within an azimuth interval of ±5°. Therefore, this paper adopts this criterion in selecting surrounding samples: when the azimuth difference between a candidate and the reference sample is lower than 5°, the candidate is incorporated into the local dictionary. Based on the local dictionary, this paper performs the optimal representation of the test sample class by class as follows:
$$\hat{\alpha} = \arg\min_{\alpha} \|y - D\alpha\|_2^2 + \lambda \|\alpha\|_2^2, \tag{3}$$
where $y$ is the test sample, $D$ represents the local dictionary selected on a certain class, $\alpha$ is the corresponding coefficient vector, and $\lambda$ is the regularization coefficient. The above optimization has the following analytical solution:
$$\hat{\alpha} = (D^{T} D + \lambda I)^{-1} D^{T} y, \tag{4}$$
where $I$ represents the identity matrix.
Compared with the traditional SRC mechanism, the optimization problem in equation (3) emphasizes the reconstruction accuracy. On the local dictionary, no sparsity constraint is imposed, so the optimal reconstruction is more effective. According to the coefficient vector solved on each class, the corresponding reconstruction error for the test sample is calculated as in equation (2). Finally, the target class of the test sample is determined according to the principle of the minimum error.
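The local reconstruction step for a single class can be sketched as follows. This is an illustrative sketch rather than the paper's code: it assumes the regularized problem is an $\ell_2$-penalized least-squares fit with the closed-form solution $(D^{T}D + \lambda I)^{-1}D^{T}y$, that azimuths wrap around at 360°, and that the reference atom is the one with the largest signed coefficient. The names `azimuth_gap`, `local_ridge_error`, `window_deg`, and `lam` are hypothetical.

```python
import numpy as np

def azimuth_gap(a, b):
    """Smallest absolute angular difference in degrees (assumes wrap-around at 360)."""
    d = np.abs(np.asarray(a, dtype=float) - b) % 360.0
    return np.minimum(d, 360.0 - d)

def local_ridge_error(A_c, az_c, x_c, y, window_deg=5.0, lam=0.01):
    """Reconstruction error of y on the local dictionary of one class.

    A_c : (d, n_c) atoms of the class
    az_c: (n_c,) azimuths of those atoms in degrees
    x_c : (n_c,) global sparse coefficients restricted to this class
    Returns (squared error, boolean mask of the atoms kept).
    """
    ref = int(np.argmax(x_c))                    # atom with the largest coefficient
    keep = azimuth_gap(az_c, az_c[ref]) < window_deg
    D = A_c[:, keep]                             # local dictionary
    # Closed-form regularized least squares: (D^T D + lam I)^{-1} D^T y
    alpha = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    return float(np.sum((y - D @ alpha) ** 2)), keep
```

Evaluating `local_ridge_error` once per class and taking the class with the smallest returned error corresponds to the minimum error decision described above. Using `np.linalg.solve` avoids forming the explicit matrix inverse.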
3.2. Target Recognition Procedure
According to the above algorithms, the basic process of the recognition method proposed in this paper is shown in Figure 1. It can be further decomposed into the following steps:
Step 1: the test sample is processed by SRC based on the global dictionary formed by all the training samples, and the sparse coefficient vector is solved
Step 2: the optimal local dictionary for each class is established according to Section 3.1
Step 3: the optimal reconstruction of the test sample is performed on the local dictionary of each class, and the corresponding reconstruction error is calculated
Step 4: the target class of the test sample is determined according to the principle of the minimum error

In the specific implementation process, the principal component analysis (PCA) is used to extract feature vectors for all training and test samples so as to improve the overall classification efficiency. The OMP algorithm is used to solve the sparse coefficient vector resulting from the global dictionary.
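As a rough illustration of the PCA feature extraction step, the sketch below fits the projection on the training images only and applies it to any sample. The paper does not state the number of components or any preprocessing, so `n_components`, the flattening of each image into a row vector, and the function names `fit_pca` and `pca_transform` are assumptions made for this example.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Fit PCA on training data of shape (n_samples, n_pixels).

    Returns the training mean and the top principal directions as rows.
    """
    mean = X_train.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples (rows of X) onto the fitted principal directions."""
    return (X - mean) @ components.T
```

The transposed training features, `pca_transform(X_train, mean, components).T`, can then serve as the columns (atoms) of the global dictionary, and a test image is projected with the same `mean` and `components` before sparse coding.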
4. Experiments
4.1. MSTAR Dataset
The performance of the proposed method is tested based on the MSTAR dataset. The dataset has been the authoritative benchmark for the testing and verification of SAR target recognition methods since its public release in the 1990s. Figure 2 shows the targets in the dataset, including tanks, armored vehicles, and transport vehicles. The SAR images of each type of target cover the full range of azimuths and several depression angles, so that various conditions can be flexibly set to carry out experiments. Following the existing literature, Table 1 gives a typical experimental setup, using SAR images of all 10 types of targets at 17° and 15° depression angles as the training and test sets, respectively. In particular, the test samples of the BMP2 and T72 targets have more configurations than their training samples (different configurations are marked by the notations in the parentheses). In general, under the experimental condition in Table 1, the differences between the training and test sets are relatively small, so the condition can be approximated as SOC. In addition, given the diversity of the image samples in the MSTAR dataset, several EOCs can be set or simulated, such as configuration variance, depression angle variance, and noise interference.

Figure 2: Targets in the MSTAR dataset, panels (a) to (j).
In the experiments, the proposed method is compared with several types of existing methods, with a focus on traditional SRC-based methods, including the SRC1 method in [32], the SRC2 method in [33], and the SRC3 method in [34]. These three methods either adopt different decision-making mechanisms or introduce new solution constraints. In addition, a CNN-based method, ESENet, proposed in [29], is included in the comparison. The follow-up experiments cover four conditions: one SOC and three EOCs. SOC evaluates the basic recognition performance of a method, and the EOCs verify its robustness, mainly its reliability in complex scenarios.
4.2. SOC
First, the performance of the proposed method is tested under SOC using the training and test sets in Table 1. In this case, the similarities between the test samples and the training samples are relatively high, so the difficulty of the recognition problem is relatively low. However, since Table 1 involves 10 types of targets, correct classification still faces certain challenges. Figure 3 shows the recognition results of the proposed method under SOC. In the confusion matrix, each diagonal element is the correct recognition rate of the corresponding target. Considering the correctly recognized samples of all 10 types of targets, the average recognition rate of the proposed method reaches 99.04%, which shows its effectiveness. Table 2 compares the recognition results of various methods under this condition. The recognition rate of the proposed method is significantly higher than those of the traditional SRC-based methods, indicating that proper local dictionary selection and optimal reconstruction can improve the performance of SAR target recognition. Compared with the ESENet method, the recognition rate of the proposed method is slightly higher. Because of the configuration differences in the BMP2 and T72 test samples, the adaptability of the trained networks declines to a certain extent. In summary, the proposed method has certain performance advantages under SOC.
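For readers who want to reproduce this kind of summary, the sketch below builds a confusion matrix from predicted labels and derives the per-class and average recognition rates. It assumes integer class labels and takes the average rate as the number of correctly recognized samples divided by the total number of test samples, which is one reading of the description above; the function name `recognition_rates` is hypothetical.

```python
import numpy as np

def recognition_rates(true_labels, pred_labels, n_classes):
    """Return (count matrix, per-class rates, pooled average rate).

    Rows of the count matrix are true classes, columns are predicted classes.
    """
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    counts = np.zeros((n_classes, n_classes), dtype=int)
    # Accumulate one count per (true, predicted) pair, including repeats.
    np.add.at(counts, (true_labels, pred_labels), 1)
    per_class = counts.diagonal() / counts.sum(axis=1)
    average = counts.trace() / counts.sum()
    return counts, per_class, average
```

Dividing each row of the count matrix by its row sum gives the normalized confusion matrix whose diagonal holds the per-class recognition rates.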

4.3. EOC-1: Configuration Variance
Table 3 sets the training and test samples under the condition of configuration variance, including 3 types of targets. Among them, the test samples and training samples of the BMP2 and T72 targets are from different configurations. Table 4 shows the average recognition rates of different methods in this situation. The comparison shows that the proposed method maintains the highest performance under configuration variance, showing its superior robustness. Compared with the other three SRC-based methods, this paper selects the local dictionary based on the global sparse coefficients and performs the optimal reconstruction class by class, which further improves the effectiveness of the proposed method. Under configuration variance, there are only small differences in the local structure of the targets between the test and training samples. Reconstruction within a single class, rather than on the global dictionary, helps to discover such subtle differences, thereby improving the recognition accuracy. Compared with the results under SOC, the recognition performance of the ESENet method decreases the most significantly, mainly because the influence of the configuration variance is further aggravated in this case.
4.4. EOC-2: Depression Angle Variance
The MSTAR dataset also includes SAR images of several types of targets at multiple depression angles. As shown in Table 5, the training samples are images of 3 types of targets, i.e., 2S1, BRDM2, and ZSU23/4, at the depression angle of 17°; the test samples are from the depression angles of 30° and 45°, respectively. The large depression angle variance decreases the similarity between the test and training samples, which hinders correct recognition. Table 6 compares the average recognition rates of various methods at the two test depression angles. The average recognition rates of the proposed method are 95.45% and 73.28% at the depression angles of 30° and 45°, respectively, which compares favorably with the other methods. Especially at 45°, where the depression difference is large, the advantages of the proposed method are more significant. Through proper local dictionary selection and optimal reconstruction, the characterization ability of different classes for the current test sample can be exploited to the greatest extent, so the method adapts better to depression angle variance.
4.5. EOC-3: Noise Corruption
To quantitatively test the noise robustness of the proposed method under different signal-to-noise ratios (SNRs), the test samples in Table 1 are used as the benchmark, and different levels of Gaussian white noise are added to them following the approach in [19]. Figure 4 plots the recognition rate curves of various methods as the SNR varies. The proposed method achieves the highest average recognition rate at each noise level, showing its better noise robustness. Compared with the ESENet method, the SRC-based methods are generally more robust, especially at low SNRs, verifying that sparse representation has a certain robustness to noise interference. The proposed method examines the representation ability of each class on a reliable local dictionary and can deal with noise corruption more effectively through the optimal reconstruction process.
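One common way to simulate noise corruption at a prescribed SNR is sketched below. The exact procedure of [19] is not reproduced in this paper, so the sketch makes its own assumptions: the image is a real-valued array, the SNR is defined as the ratio of mean signal power to noise variance, and the function name `add_awgn` is hypothetical.

```python
import numpy as np

def add_awgn(image, snr_db, rng=None):
    """Add zero-mean white Gaussian noise so the result has roughly the given SNR in dB.

    Assumption: SNR = mean(image**2) / noise variance, on a real-valued image.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=float)
    signal_power = np.mean(image ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=image.shape)
    return image + noise
```

Applying `add_awgn` to every test image over a range of `snr_db` values and re-running the classifier yields a recognition rate curve of the kind described for Figure 4.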

5. Conclusion
This paper proposes an improved SRC for SAR target recognition. On the basis of the global sparse coefficients obtained from traditional SRC, a local dictionary is constructed for each training class. Considering the azimuthal sensitivity of SAR images, this paper takes the sample with the largest sparse coefficient in each class as the reference and selects its neighboring samples with nearby azimuths to construct the local dictionary with the strongest representation ability. Based on the local dictionary, the linear fitting is performed on the test sample according to the idea of optimal reconstruction, and finally the target class of the test sample is determined by comparing the reconstruction errors resulting from different classes. Based on the MSTAR dataset, the proposed method is tested under 4 typical conditions. The experimental results show that the proposed method maintains superior performance under both SOC and EOCs, which supports its effectiveness and robustness for SAR target recognition.
Data Availability
The dataset used in this paper is publicly available.
Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.