Abstract
This study focuses on extracting representative features for the intelligent classification of ultrasonic phased array images of internal defects in naval gun mounts. To this end, an improved sparse autoencoder network model (RSAE) is proposed to realize the re-expression of the sample data. In intelligent classification, deterministic initial weights lead to either the best or the worst result of neural network training, and in complex problems the worst result is very likely; at the same time, when a neural network is trained with random initial weights, the results of repeated training runs fluctuate greatly, which is not conducive to evaluating the performance of the network model. Therefore, this paper does not directly use the correlation parameter between each feature and the defect category as the initial feature weight of the RSAE. Instead, an interval (cell) containing this correlation parameter is constructed, and the initial weight is drawn at random from that cell. On this basis, the optimization goals are to minimize the reconstruction error of the training sample data, minimize the deviation of sample data within a class, and maximize the difference of sample data between classes, thereby realizing the re-expression of the sample data. The experimental results show that the high-level features obtained by the proposed improved sparse autoencoder outperform the original features in pattern recognition. The network can therefore be used to identify the types of internal defects in the welds of naval gun mounts more accurately.
1. Introduction
In the production process of naval gun cradle parts, deviations in welding process parameters, environmental temperature, operator error, and other factors may introduce defects into the weld of the workpiece that cannot be found by the naked eye. Even minor defects can greatly reduce the mechanical properties of the workpiece and shorten its service life, causing substantial economic losses [1]. Ultrasonic phased array inspection [2] is a mature nondestructive testing technology that has been widely used in industry; in particular, it can be used to inspect welds with poor accessibility and welds between dissimilar metals. Traditional methods rely on manual visual observation, but the differences between ultrasonic phased array patterns produced by different types of defects are often too small for the human eye to distinguish. This difficulty is magnified when the inspection volume is large, so a highly reliable intelligent detection method is urgently needed.
Intelligent qualitative analysis has gradually come into use in various application scenarios, and the related methods are developing rapidly, especially those combining deep learning with defect recognition, where technological updates are frequent. At present, intelligent classification methods can be divided into two categories: methods based on traditional machine learning with manually extracted features, and convolutional neural network (CNN) methods that automatically extract high-dimensional features.
When extracting features manually with traditional machine learning, the focus should be on finding features with significant classification capability. Among methods for finding such features, statistical extraction of texture features is widely used, and texture features have good discriminative power. Sambath et al. [3] and Wang et al. [4, 5] extracted features such as mean, variance, and energy, which professionals also use for manual recognition. Polikar et al. [6] used the discrete wavelet transform to extract features. Al-Ataxy [7] used wavelet packets and the gray-level co-occurrence matrix to extract relevant features for classifying weld defects. Theresa Cenate [8] extracted mean, standard deviation, energy, skewness, and kurtosis features as the input of an intelligent classification network. Haralick [9] proposed a texture quantization method based on the gray-level co-occurrence matrix. Cui et al. [10] proposed a new advancing coupled multistable stochastic resonance method with two first-order multistable stochastic resonance systems, which is more conducive to extracting weak signal features. Feature extraction for defect classification from ultrasonic inspection data has also been extensively explored in the feature recognition of medical pathology images: Shakeel et al. [11] preprocessed lung images and then extracted their spectral characteristics. In their article on distinguishing normal bone density from osteoporosis, Kawashima et al. [12] used the gray run length method to extract features and then used artificial neural networks for recognition. In osteoporosis, long runs increase and short runs decrease, and the same pattern occurs with porosity defects, so there is reason to believe that the gray run length method can also extract effective features of industrial porosity defects. Valentinitsch et al. [13] extracted local texture features from CT images, and these features show significant advantages in the quantitative analysis of osteoporosis. Mugasa et al. [14] proposed an adaptive feature extraction model for thyroid classification. These studies suggest that suitable features should be extracted for a specific defect. Munir et al. [15] proposed a network model combining a denoising autoencoder and a convolutional neural network to classify weld defects. Gong et al. [16] proposed a deep transfer learning model to automatically detect inclusions in composite materials. Du et al. [17] proposed a feature pyramid network for nondestructive testing of automotive aluminum castings. Chang et al. [18] used support vector machines to classify thyroid nodules. Zhang et al. [19] used deep polynomial networks to distinguish malignant from benign breast tumors. Guo et al. [20] proposed an algorithm based on an optimized deep Q network (DQN), which speeds up model convergence. Li et al. [21] presented a novel scheme, VMD-RSBL, for the forecasting task by integrating variational mode decomposition (VMD) and random sparse Bayesian learning (RSBL, SBL-based prediction with random lags and random samples). Convolutional neural networks are powerful, but their training time is long and their sample requirements are demanding.
In practical applications, retraining is inevitably required as the data set grows, and the time cost is high. The essence of a convolutional neural network is also to find a suitable and distinctive feature representation. Therefore, whether accurate and suitable feature representations can be extracted is the key to the success of an intelligent classification network.
In view of the research discussed above, the biggest challenge in accurately classifying the internal defects of the naval gun cradle from ultrasonic phased array maps is extracting features that respond significantly to the classification task. This paper therefore proposes an improved sparse autoencoder network model, which uses the Relief-F algorithm to calculate the sensitivity of each feature to the types of internal defects in naval gun cradle welds and assigns the matrix composed of these sensitivities as the initial weight parameters of the RSAE. The optimization objectives are to minimize the reconstruction error of the training sample data, minimize the deviation of sample data within a class, and maximize the difference of sample data between classes. Experiments show that the high-level features obtained by the improved sparse autoencoder outperform the original features in pattern recognition, and the combination of RSAE and KELM applied to the intelligent classification of ultrasonic phased array defect maps achieves higher accuracy. The method can be used for the field inspection of welded workpieces of the naval gun cradle.
2. Materials and Methods
2.1. Defect Image Data Collection
In order to evaluate the performance of the above-mentioned methods, an ultrasonic defect image database was developed. Ultrasonic defect images were collected from standard welded test blocks and from test blocks containing natural defects. The collection scene is shown in Figure 1. The types of internal weld defects include slag inclusion, cracks, pores, incomplete penetration, and incomplete fusion.

The instrument used for ultrasonic defect image acquisition is a French M2M desktop phased array unit, and the imaging method is full-focus (total focusing) imaging. Probes of 10 MHz and 5 MHz were selected, and 6368 ultrasonic defect maps were collected by changing the scanning angle. The number of defects of each type is given in Table 1, and ultrasonic images of the internal weld defects are shown in Figure 2.

2.2. Defect Image Preprocessing
Noise reduction of the ultrasonic phased array signal with a low-pass smoothing filter cannot eliminate noise inside the amplifier frequency band. Therefore, wavelet filtering is used to remove high-frequency clutter noise from the sparse total-focusing map of the ultrasonic phased array, which also suppresses the interference of irrelevant texture information. Median filtering is then applied to eliminate the speckle noise that easily forms during ultrasonic inspection and imaging, highlighting the edge contour of the defect and improving image contrast and edge sharpness. Finally, the RGB threshold method is used to segment the denoised ultrasonic phased array focusing map, filtering out the interference of the coupling-interface echo and the bottom-surface echo and producing a standard map for calculating the gray-image texture characteristics of the casting and forging defects. The changes before and after preprocessing are shown in Figure 3.
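A minimal MATLAB sketch of this preprocessing chain is given below. The file name, wavelet basis, decomposition level, and RGB thresholds are illustrative assumptions rather than the settings used in this work.

```matlab
% Sketch of the preprocessing chain: wavelet denoising, median filtering,
% and RGB-threshold segmentation (all parameter choices below are assumed).
I  = imread('defect_map.png');                            % hypothetical phased array image file
G  = double(rgb2gray(I));
[thr, sorh, keepapp] = ddencmp('den', 'wv', G);           % default global wavelet threshold
Gw = wdencmp('gbl', G, 'sym4', 2, thr, sorh, keepapp);    % suppress high-frequency clutter noise
Gm = medfilt2(uint8(Gw), [3 3]);                          % remove speckle noise, keep defect edges
mask = I(:,:,1) > 100 & I(:,:,2) < 100 & I(:,:,3) < 100;  % illustrative RGB threshold segmentation
Gseg = Gm;  Gseg(~mask) = 0;                              % suppress interface and bottom-surface echoes
```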

2.3. Defect Feature Extraction
Five feature extraction methods, namely, the gray-level co-occurrence matrix, gray-level run length, gray-level difference statistics, the Gauss Markov random field model, and Hu invariant moments, are used to extract 35 feature parameters. The features extracted by each method are as follows (a computational sketch for the co-occurrence features is given after the list):
(i) Gray co-occurrence matrix: (1) mean energy, (2) standard deviation of energy, (3) mean entropy, (4) standard deviation of entropy, (5) mean moment of inertia, (6) standard deviation of moment of inertia, (7) correlation mean, and (8) correlation standard deviation
(ii) Gray run length: (9) total run length, (10) total run length percentage, (11) short run advantage, (12) long run advantage, and (13) unevenness of gray-scale distribution
(iii) Gray difference statistics: (14) average value, (15) contrast, and (16) entropy
(iv) Gauss Markov random field model: (17)–(28) twelve 4th-order GMRF parameters
(v) Hu invariant moments: (29)–(35) seven kinds of invariant moments
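The sketch below illustrates, for example, how the gray co-occurrence matrix features (1)–(8) can be computed with MATLAB's Image Processing Toolbox; the offsets, quantization level, and file name are assumptions.

```matlab
% Illustrative computation of the gray co-occurrence matrix features (1)-(8).
G = imread('defect_gray.png');                            % hypothetical preprocessed gray image
offsets = [0 1; -1 1; -1 0; -1 -1];                       % 0, 45, 90, and 135 degree directions
glcm  = graycomatrix(G, 'Offset', offsets, 'NumLevels', 16, 'Symmetric', true);
stats = graycoprops(glcm, {'Energy', 'Contrast', 'Correlation'});
energyMean  = mean(stats.Energy);       energyStd  = std(stats.Energy);       % features (1)-(2)
inertiaMean = mean(stats.Contrast);     inertiaStd = std(stats.Contrast);     % moment of inertia, (5)-(6)
corrMean    = mean(stats.Correlation);  corrStd    = std(stats.Correlation);  % features (7)-(8)
p   = bsxfun(@rdivide, glcm, sum(sum(glcm)));             % normalised co-occurrence probabilities
ent = squeeze(-sum(sum(p .* log(p + eps))));              % entropy of each directional GLCM
entMean = mean(ent);  entStd = std(ent);                  % features (3)-(4)
```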
2.4. Principle of Relief-F Algorithm
Different features have different magnitudes, which greatly affects the calculation of the correlation between features and categories. Therefore, before calculating the correlation between each feature and the defect category, the dispersion standardization (min-max) method is used to normalize all features. The normalization formula for the j-th feature of a sample is $\tilde{x}_{ij} = (x_{ij} - \min_j)/(\max_j - \min_j)$, where $\tilde{x}_{ij}$ is the value of the j-th feature of the i-th sample after normalization, $\min_j$ is the minimum value of all samples on the j-th feature, and $\max_j$ is the maximum value of all samples on the j-th feature.
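A minimal MATLAB sketch of this normalization, with a placeholder feature matrix, is as follows.

```matlab
% Min-max (dispersion) normalisation of every feature column; F is a placeholder
% N-by-35 matrix standing in for the 35 extracted features of N defect images.
F     = rand(6368, 35);                                   % placeholder feature matrix
Fmin  = min(F, [], 1);                                    % minimum of each feature over all samples
Fmax  = max(F, [], 1);                                    % maximum of each feature over all samples
Fnorm = bsxfun(@rdivide, bsxfun(@minus, F, Fmin), Fmax - Fmin);   % every feature scaled to [0, 1]
```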
The Relief-F algorithm is widely used in multiclassification tasks because of its high efficiency and speed. The method of calculating the correlation between features and defect categories is as follows.
Suppose the number of samplings is K and the number of nearest neighbor samples is T; the correlation between each feature and the defect category is initialized to 0, and the update for one sampling proceeds as follows:
(1) A sample X is randomly selected from the sample set Q, and the category of X is Y.
(2) Find the T nearest neighbors of X among the samples of the same class as X, forming the same-class sample set D, where $D_g$ denotes the g-th (g = 1, 2, …, T) sample of this set. At the same time, for each class h ≠ Y, find the T nearest neighbors of X in that class, forming the different-class sample set $E_h$, where $E_{hn}$ denotes the n-th (n = 1, 2, …, T) sample of the h-th different-class set.
(3) Update the correlation W between each feature and the defect category under this sampling. The correlation of the j-th feature is updated as
$W_j = W_j - \sum_{g=1}^{T}\frac{\mathrm{diff}(j, X, D_g)}{K\,T} + \sum_{h \ne Y}\frac{P(h)}{1-P(Y)}\sum_{n=1}^{T}\frac{\mathrm{diff}(j, X, E_{hn})}{K\,T}.$
Here, $\mathrm{diff}(j, X, D_g)$ is the distance between sample X and $D_g$ on the j-th feature, and $\mathrm{diff}(j, X, E_{hn})$ is the distance between sample X and $E_{hn}$ on the j-th feature. $P(Y)$ is the probability of the class Y of sample X in the sample set Q, and $P(h)$ is the probability of the h-th class in the sample set Q.
From the above steps, the correlation between each feature and the defect category under one sampling can be calculated; over K samplings, the correlations between all features and the defect category are updated K times.
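A MATLAB sketch of this update is given below, under the assumption that the per-feature distance is the absolute difference of normalized feature values; the relieff function in the Statistics and Machine Learning Toolbox offers a ready-made alternative.

```matlab
% Sketch of the Relief-F correlation update described above.
% X: N-by-p normalised feature matrix, y: N-by-1 class labels,
% K: number of samplings, T: number of nearest neighbours.
function W = relieff_sketch(X, y, K, T)
    [N, p] = size(X);
    W = zeros(1, p);                                  % correlation of each feature with the class
    classes = unique(y);
    prior = arrayfun(@(c) mean(y == c), classes);     % class probability P(c) in the sample set Q
    for k = 1:K
        i  = randi(N);  xi = X(i, :);                 % randomly draw a sample X of class y(i)
        sameIdx = find(y == y(i));  sameIdx(sameIdx == i) = [];
        hitK = min(T, numel(sameIdx));
        hit  = X(sameIdx(knnsearch(X(sameIdx, :), xi, 'K', hitK)), :);    % nearest hits (set D)
        W = W - sum(abs(bsxfun(@minus, hit, xi)), 1) / (K * T);           % penalise within-class spread
        for c = reshape(classes(classes ~= y(i)), 1, [])                  % every other class h
            pool  = X(y == c, :);
            missK = min(T, size(pool, 1));
            miss  = pool(knnsearch(pool, xi, 'K', missK), :);             % nearest misses (set E_h)
            wc = prior(classes == c) / (1 - prior(classes == y(i)));      % weighting P(h)/(1-P(Y))
            W  = W + wc * sum(abs(bsxfun(@minus, miss, xi)), 1) / (K * T);% reward between-class spread
        end
    end
end
```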
2.5. Initial Weight Parameter Generation
Deterministic initial weights lead to either the best or the worst result of neural network training, and in complex problems the worst result is very likely; at the same time, when a neural network is trained with random initial weights, the results of repeated training runs fluctuate greatly, which is not conducive to evaluating the performance of the network model. For these reasons, this article does not directly use the correlation parameter between each feature and the defect category as the initial feature weight of the RSAE; instead, an interval (cell) containing the correlation parameter between the feature and the defect category is constructed, and the initial weight is drawn from this cell. The method is as follows.
First, sort the correlation parameters between the features and the defect category in ascending order; then take 0 as the starting point and the maximum correlation parameter plus 0.2 as the end point to determine a large interval. If the difference between the correlation parameters of features is less than or equal to 0.2, these features are treated as one group; otherwise, a single feature forms its own group. The maximum correlation coefficient of each group plus 0.2 is used as a dividing point to divide the large interval into multiple cells. The cell corresponding to each feature is listed in Table 2.
Each cell in Table 2 is multiplied by one-thousandth to obtain the sensitivity parameter interval between the feature and the V-shaped weld defect type, and a random number drawn from this interval is assigned as the initial weight parameter of the RSAE, so that each feature is weighted by its corresponding sensitivity parameter.
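The sketch below implements our reading of this grouping and sampling rule in MATLAB; the placeholder correlation vector and the handling of the group boundaries are assumptions.

```matlab
% Group the sorted Relief-F correlations whose consecutive difference is at most 0.2,
% take "group maximum + 0.2" as the cell boundaries, scale each cell by one-thousandth,
% and draw the RSAE initial weight of every feature at random inside its cell.
W = rand(1, 35);                                  % placeholder Relief-F correlations of 35 features
[Ws, order] = sort(W, 'ascend');
grp  = cumsum([1, diff(Ws) > 0.2]);               % start a new group wherever the gap exceeds 0.2
nGrp = grp(end);
bound = zeros(1, nGrp + 1);                       % cell boundaries, starting from 0
for g = 1:nGrp
    bound(g + 1) = max(Ws(grp == g)) + 0.2;       % group maximum plus 0.2 as the dividing point
end
w0s = zeros(size(Ws));
for g = 1:nGrp
    lo = bound(g) / 1000;  hi = bound(g + 1) / 1000;        % sensitivity interval of this cell
    idx = find(grp == g);
    w0s(idx) = lo + (hi - lo) .* rand(1, numel(idx));       % random initial weight inside the cell
end
w0 = zeros(size(W));  w0(order) = w0s;            % restore the original feature order
```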
2.6. Sparse Autoencoder Network Model
The basic idea of an autoencoder (AE) is to learn a representation of its input and then decode this representation into an output that reconstructs the input with the smallest possible error. The AE is an unsupervised learner.
The AE consists of three layers, namely, the input layer, the hidden layer, and the output layer, as shown in Figure 4. The encoder maps the input vector x to the hidden layer h, and the decoder maps the hidden layer to the output layer $\hat{x}$. The encoder can be expressed as $h = f(W_1 x + b_1)$, where f is the activation function, $W_1$ is the weight matrix, and $b_1$ is the bias vector. The decoder can be expressed as $\hat{x} = g(W_2 h + b_2)$, where $W_2$ is the weight matrix and $b_2$ is the bias vector.

The difference between the input vector and the output vector is defined as the reconstruction error, that is, the loss function of the AE:
$J_{AE} = \frac{1}{2m}\sum_{i=1}^{m}\lVert x_i - \hat{x}_i \rVert^2.$
The sparse autoencoder [22] (SAE) adds a sparse penalty term to the objective function of the autoencoder so that the learned features are constrained. The loss function of the SAE is
$J_{SAE} = J_{AE} + \lambda \lVert W \rVert^2 + \beta \sum_{j=1}^{s} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j),$
where s represents the number of neurons in the hidden layer, i indexes the input vectors, m is the number of input vectors, λ is the weight of the penalty term, β is the sparsity weight, and j indexes the neurons in the hidden layer. The sparsity penalty is
$\mathrm{KL}(\rho \,\|\, \hat{\rho}_j) = \rho \log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j},$
where ρ is a sparsity parameter, usually a value close to 0, and $\hat{\rho}_j = \frac{1}{m}\sum_{i=1}^{m} h_j(x_i)$ represents the average activation of the j-th hidden neuron.
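An illustrative MATLAB evaluation of these loss terms for one forward pass is given below; the network sizes, sparsity target, and penalty weights are assumptions.

```matlab
% Evaluation of the SAE loss terms for one forward pass with placeholder data.
d = 35;  m = 200;  s = 32;                                % input size, sample count, hidden neurons
X  = rand(d, m);                                          % placeholder input vectors (one per column)
W1 = 0.1 * randn(s, d);  b1 = zeros(s, 1);                % encoder parameters
W2 = 0.1 * randn(d, s);  b2 = zeros(d, 1);                % decoder parameters
sigm = @(z) 1 ./ (1 + exp(-z));
H    = sigm(W1 * X + repmat(b1, 1, m));                   % hidden activations
Xhat = sigm(W2 * H + repmat(b2, 1, m));                   % reconstruction of the input
Jae  = sum(sum((X - Xhat).^2)) / (2 * m);                 % reconstruction error
rho    = 0.05;                                            % sparsity parameter, close to 0
rhoHat = mean(H, 2);                                      % average activation of each hidden neuron
KL = sum(rho * log(rho ./ rhoHat) + (1 - rho) * log((1 - rho) ./ (1 - rhoHat)));  % sparsity penalty
lambda = 1e-4;  betaS = 3;                                % penalty weight and sparsity weight (assumed)
Jsae = Jae + lambda / 2 * (sum(W1(:).^2) + sum(W2(:).^2)) + betaS * KL;           % SAE loss
```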
2.7. Kernel Extreme Learning Machine
The extreme learning machine (ELM) is a single-hidden-layer feedforward neural network composed of an input layer, a hidden layer, and an output layer. The input layer is fully connected to the hidden layer, and the hidden layer is fully connected to the output layer. The connection weight between the input layer and the hidden layer is W, the bias vector is b, and the connection weight between the hidden layer and the output layer is β. The ELM randomly generates W and b before training; the number of hidden layer neurons and their activation function need to be determined in order to calculate β. The learning algorithm of the ELM has the following main steps (a sketch follows the next paragraph):
(1) Determine the number of neurons in the hidden layer, and randomly set the connection weight W between the input layer and the hidden layer and the bias b of the hidden layer neurons.
(2) Select the activation function of the hidden layer neurons, and compute the hidden layer output matrix.
(3) Calculate the connection weight between the hidden layer and the output layer.
The Kernel Extreme Learning Machine (KELM) is an improved algorithm based on the extreme learning machine combined with the kernel function. KELM can improve the prediction performance of the model while retaining the advantages of ELM.
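A minimal MATLAB sketch of these steps, together with the KELM variant using a Gaussian kernel, is shown below; the problem sizes, sigmoid activation, kernel width, and regularization constant are assumptions.

```matlab
% ELM steps (1)-(3) and a Gaussian-kernel KELM variant, with placeholder data.
Ntr = 300;  Nte = 100;  d = 35;  c = 5;  L = 32;          % placeholder problem sizes
Xtr = rand(Ntr, d);  Xte = rand(Nte, d);                  % placeholder feature matrices
ytr = randi(c, Ntr, 1);
Ttr = double(bsxfun(@eq, ytr, 1:c));                      % one-hot training targets
W = 2 * rand(d, L) - 1;  b = rand(1, L);                  % (1) random input weights and biases
H = 1 ./ (1 + exp(-(Xtr * W + repmat(b, Ntr, 1))));       % (2) hidden layer output matrix (sigmoid)
beta = pinv(H) * Ttr;                                     % (3) output weights by least squares
% KELM: replace the explicit hidden layer with a Gaussian kernel plus a ridge term.
C = 1;  sigma = 1;                                        % regularisation constant and kernel width
Ktr = exp(-pdist2(Xtr, Xtr).^2 / (2 * sigma^2));
Kte = exp(-pdist2(Xte, Xtr).^2 / (2 * sigma^2));
alpha = (Ktr + eye(Ntr) / C) \ Ttr;                       % kernel-space output weights
[~, yPred] = max(Kte * alpha, [], 2);                     % predicted class of each test sample
```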
2.8. Improved Sparse Autoencoder Network Model
The autoencoder is a single-hidden-layer neural network trained by unsupervised learning: the output is approximately equal to the input, and a good representation of the image data is obtained. The sparse autoencoder adds a regularization constraint to the autoencoder; this mimics the human brain by allowing only some neurons in the hidden layer to respond to specific characteristic stimuli, thereby obtaining an efficient data representation. In this paper, the sparse autoencoder is improved: a specific initial weight parameter replaces the random initial weights of the traditional sparse autoencoder, and three constraints are imposed on the network to obtain higher-level abstract features and their associations, finally achieving an excellent re-expression of the sample data. The network model is shown in Figure 5.

A supervised feature-coding learning objective is constructed for the gray-image texture feature set that responds significantly to the types of internal defects in the weld:
(i) Minimizing the reconstruction error of the training sample data, where H is the output of the hidden layer of the extreme learning machine, G is the activation function, and N is the number of training samples.
(ii) Minimizing the difference of the sample data within a class, where $S_{ij}$ represents the similarity between samples $x_i$ and $x_j$: if $x_i$ and $x_j$ belong to the same class, then $S_{ij} = 1$; otherwise, $S_{ij} = 0$.
(iii) Maximizing the difference of the sample data between classes, where $D_{ij}$ represents the difference between samples $x_i$ and $x_j$: if $x_i$ and $x_j$ belong to the same class, then $D_{ij} = 0$; otherwise, $D_{ij} = 1$.
To achieve these learning goals, the method simultaneously minimizes the reconstruction error of the input training data, minimizes the within-class distance of the sample data in the high-order feature space, and maximizes the between-class distance; the three terms are combined with trade-off weights into the comprehensive objective function (12).
The comprehensive objective (12) is optimized with respect to the output weight parameter β by setting the gradient to zero, ∇β = 0, which yields (13). Here, I is the identity matrix; $A_S$ is a diagonal matrix whose diagonal elements are the row sums of the similarity matrix S, and, similarly, $A_D$ is a diagonal matrix whose diagonal elements are the row sums of the difference matrix D.
(13) has the form of a Sylvester equation, so the sylvester() function in MATLAB is used to solve it and obtain the optimal output weight parameter β. The training sample data X are then re-expressed through the mapping defined by the learned β.
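The snippet below only demonstrates, in schematic form, how such a Sylvester-type optimality condition can be solved with MATLAB's sylvester() function; the coefficient matrices, Laplacian construction, trade-off weights, and final re-expression mapping are all assumptions for illustration, not the paper's exact derivation.

```matlab
% Schematic solution of an optimality condition of the form A*beta + beta*B = C.
N = 200;  L = 32;  d = 35;                                % placeholder sizes
H = randn(N, L);  X = rand(N, d);                         % hidden outputs and training data (assumed)
S = double(rand(N) > 0.8);  S = max(S, S');               % placeholder within-class similarity matrix
D = 1 - S;                                                % placeholder between-class difference matrix
LS = diag(sum(S, 2)) - S;  LD = diag(sum(D, 2)) - D;      % graph Laplacians of S and D (assumed)
lambda1 = 1;  lambda2 = 0.1;                              % assumed trade-off weights
A = H' * H + lambda1 * (H' * LS * H) - lambda2 * (H' * LD * H);   % assumed left coefficient matrix
B = 1e-3 * eye(d);                                        % assumed right coefficient (regulariser)
C = H' * X;                                               % assumed right-hand side
beta = sylvester(A, B, C);                                % solve A*beta + beta*B = C for beta
Xnew = H * beta;                                          % re-expression of the training data (assumed)
```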
3. Experiments and Results
3.1. Experimental Setup
The experiment was carried out under the Windows 7 system; the programming environment was MATLAB R2015b, and the hardware environment was an Intel(R) Core(TM) i3-4150 CPU with 32 GB of memory on a 64-bit operating system.
In this experiment, 60% of the defect image data is randomly selected as the training set, and the remaining 40% is used as the test set. The extreme learning machine randomly sets the connection weights and obtains the result directly by solving a system of equations; the support vector machine searches for the maximum-margin separating hyperplane; the Resnet-34 network reduces the difficulty of training a deep network by using residual structures. The kernel functions used by the Kernel Extreme Learning Machine (KELM) and the Kernel Support Vector Machine (KSVM) are Gaussian kernels. In the following experiments, the accuracy of Resnet-34 is the highest accuracy over its 20 iterations, and the accuracy of KELM and KSVM is their average accuracy over 20 training runs. The parameters used by the sparse autoencoder are shown in Table 3.
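A simple sketch of the random 60/40 split is shown below; the variable names and data are placeholders.

```matlab
% Random 60/40 train/test split of the defect feature set.
Fnorm  = rand(6368, 35);  labels = randi(5, 6368, 1);     % placeholder features and defect classes
N   = size(Fnorm, 1);
idx = randperm(N);                                        % random permutation of the samples
nTr = round(0.6 * N);                                     % 60% for training
Xtrain = Fnorm(idx(1:nTr), :);      ytrain = labels(idx(1:nTr));
Xtest  = Fnorm(idx(nTr+1:end), :);  ytest  = labels(idx(nTr+1:end));
```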
Figure 6 shows the entire experimental process. ① The classification accuracy obtained with the image data before and after preprocessing is compared. ② The classification accuracy and speed obtained with the features encoded by the improved sparse autoencoder and with the original features are compared. ③ The best classification network and its performance parameters are selected to solve the ultrasonic phased array image classification problem for internal defects in V-shaped welds.

3.2. Analysis of Results
As shown in Figure 7, without preprocessing the original images, the classification results of the Resnet-34 network, KELM, and KSVM are all unsatisfactory and significantly lower than the classification results obtained after preprocessing. The preprocessed ultrasound defect images therefore retain the information that responds significantly to the classification task.

As shown in Figure 8, using the original features as the input of KELM and KSVM, the accuracy rates are 86.5% and 81.3%, respectively; in contrast, using the features encoded by the improved sparse autoencoder as the input of KELM and KSVM, the accuracy rates are 95.5% and 90.7%, respectively, an increase of 9.0 and 9.4 percentage points. The encoded features are high-level features of the image and are more conducive to recognition by the classification network.

As shown in Figure 9, over multiple training runs the accuracy of RSAE is higher than that of YSAE and SAE, and the stability of the RSAE network is higher than that of the other two networks. This result shows that calculating the sensitivity of each feature to the internal defect type of the V-shaped weld before network training and assigning it as the initial weight parameter of the RSAE effectively improves the accuracy of the network and increases its stability.

The efficiency values in Table 4 are the time required for 20 iterations of Resnet-34 and the time required for 20 training runs of KELM and KSVM (the training and testing times of KELM and KSVM include the time used for RSAE encoding).
As shown in Table 4, the time spent on KELM training and testing is much less than the time needed for Resnet-34 training and testing. In practical engineering applications, the network model must be retrained as the amount of standard image data increases. To obtain the optimal solution with the Resnet-34 network, the number of model iterations must exceed 1,000 and the training time must exceed 150 hours (without considering the further increase in training time as the amount of data grows). The times required for training and testing KELM and KSVM are similar, but KSVM must determine multiple separating hyperplanes in multiclassification tasks, which leads to the accumulation of classification errors and a decrease in classification accuracy.
The ratio of the number of hidden layer neurons to the number of input features is set to 0.1, 0.3, 0.5, 0.7, 0.9, 1.1, 1.3, 1.5, 2, and 10 to evaluate the performance of RSAE.
As shown in Figure 10, when the ratio of the number of hidden layer neurons to the number of input features reaches 0.9, the classification accuracy of KELM reaches 97.1%. When the ratio continues to increase, the classification accuracy of KELM is basically stable, and the increase in the number of neurons makes the model training time longer, so the number of neurons in the RSAE hidden layer is set to 32.

4. Conclusion
If defects in the key load-bearing areas of the naval gun cradle are missed during inspection, the missed defects continue to grow under load, and failures such as large deformation or sudden fracture inevitably occur, directly causing the weapon equipment to lose combat effectiveness. Therefore, an accurate nondestructive testing method is needed to inspect welded structures in service. Because the shapes of internal weld defects are irregular, the defect patterns in the inspection maps are changeable and the noise is cluttered, making manual qualitative analysis of the inspection maps a complicated task. To solve this problem, we propose an improved sparse autoencoder combined with a kernel extreme learning machine. With this method, features that respond significantly to the task of classifying internal defects in the welding seam of the naval gun cradle can be found, and the kernel extreme learning machine can be used to classify the feature parameters. The innovation of this paper is to calculate the sensitivity of each feature, assign the matrix composed of these sensitivities as the initial weight parameters of the RSAE, and minimize the reconstruction error of the training sample data, minimize the deviation of sample data within a class, and maximize the difference of sample data between classes to realize the re-expression of the sample data. Extensive experiments and comparisons have demonstrated the effectiveness and efficiency of the proposed improved sparse autoencoder model.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was funded by the National Natural Science Foundation of China (NSFC) (grant number 52075270); the Science and Technology Plan Project of Inner Mongolia (grant number 2020GG0160); the Young Science and Technology Talents Support Plan Project of Inner Mongolia (grant number NJYT22063); the Natural Science Foundation of Inner Mongolia (grant number 2019MS05041); and the Technical Basic Research Project of the National Defense Science and Industry Bureau (grant number JSZL2018208C004).