Abstract
To address the low diagnostic accuracy caused by the similar data distributions of partial sensor faults, a sensor fault diagnosis method based on an improved Grey Wolf Optimization-Support Vector Machine (α-GWO-SVM) is proposed in this paper. Firstly, Kernel Principal Component Analysis (KPCA) is fused with time-domain parameters to carry out feature extraction and dimensionality reduction for the fault data. Then, an improved Grey Wolf Optimization (GWO) algorithm is applied to enhance the global search capability while speeding up the convergence, for the purpose of further optimizing the parameters of the SVM. Finally, the experimental results suggest that the proposed method performs better in optimization than the other intelligent diagnosis algorithms based on SVM and improves the accuracy of fault diagnosis effectively.
1. Introduction
The sensor functions as a major detection device in a monitoring system [1–3], and its detection accuracy will be significantly reduced by a breakdown. Additionally, a breakdown will affect the performance of the monitoring system and may even result in economic losses and casualties in some extreme cases. Therefore, it is necessary to make an accurate diagnosis of sensor faults to ensure that the monitoring system can operate smoothly and reliably.
When the fault intensity stays low, some forms of sensor failure show similar characteristics of data distribution, which is a leading cause of low diagnostic accuracy [4]. Among traditional approaches to fault diagnosis [5–7], model-based methods require the establishment of an accurate mathematical model of the research object. In practice, however, it is usually difficult to construct an accurate mathematical model for a nonlinear system. Knowledge-based methods rely heavily on expert experience, which makes them lack adaptability when new problems arise. In contrast, data-driven methods require only the learning of historical data, rather than exact mathematical models or expert knowledge.
With the rapid advancement of artificial intelligence (AI) technology, AI-based diagnostic methods have attracted much research interest in the field of fault diagnosis. In [8], a Recurrent Neural Network (RNN) is put forward to model nonlinear systems, thus achieving fault detection and isolation for sensors. An extremely randomized tree method was proposed to detect and diagnose the faults in sensor networks in [9], which demonstrated strong robustness in processing signal noise but ignored the fault diagnosis of sensor nodes. In [4], a hybrid continuous-density-HMM-based ensemble neural network method is applied to detect and classify sensor node faults.
However, due to the similar distribution of some fault data, it is necessary to train a variety of classifiers for the accurate classification of different faults. Furthermore, a fault diagnosis method intended for chiller sensors is presented in [10], which not only achieves feature extraction by clustering the fault data but also identifies the fault types by setting the clustering indicators.
Abnormal data are considered the most effective indicators of sensor failure; they are nonlinear and enormous in volume, which makes data-driven intelligent diagnosis methods more suitable for the diagnosis of sensor faults [11–13]. Machine learning is a commonly used approach to intelligent diagnosis, including Neural Networks (NNs), Support Vector Machines (SVMs), and so on. However, the amount of fault samples is usually limited, which leads to poor performance for NNs. The SVM has attracted much attention due to its capability of dealing with nonlinearity and small sample sizes in fault diagnosis [14, 15], but the correct hyperparameters must be chosen for improved performance. The mechanisms of different algorithms may be disparate, and the optimization of key parameters can often improve the performance of an algorithm [16, 17]. Researchers have proposed or improved algorithms to solve optimization problems [18–20] and achieved remarkable results, which gives us some inspiration for choosing the appropriate hyperparameters of the SVM. Besides, adopting an appropriate method for extracting the features of fault data is an effective strategy to improve the accuracy of diagnosis. However, conventional feature extraction methods such as Principal Component Analysis (PCA) [21] are more suitable for processing linear data. Time-domain parameters can also be taken as reference indicators for diagnosis, but not all of them are sensitive to all sorts of failures [22].
In order to solve the aforementioned problems, a number of solutions are proposed in this paper. Firstly, multiple time-domain parameters are extracted from the sensor fault data, and Kernel Principal Component Analysis (KPCA) is conducted to perform principal component analysis of the time-domain parameters. Then, some of the time-domain parameters are fused with the extracted principal components to obtain fusion features that can accurately reflect the characteristics of the faults. Secondly, an improved Grey Wolf Optimization (α-GWO) algorithm is proposed to achieve parameter optimization for the SVM. A competition mechanism is introduced to enhance the search ability of the algorithm; in the meantime, the dominant position of the α wolf is reinforced to speed up convergence in the later stage of the algorithm. Finally, the samples composed of the fusion features are inputted into different diagnostic models for the purpose of training and testing. The experimental results are comparatively analyzed to validate the method proposed in this paper for sensor fault diagnosis.
This paper is organized as follows. Section 2 briefly explains the improvement of the GWO algorithm. Section 3 illustrates the fault diagnosis method based on α-GWO-SVM. Simulation results and performance analysis are provided in Section 4. Contributions of the proposed method are given in Section 5.
2. An Improved Grey Wolf Algorithm
Grey Wolf Optimization (GWO) algorithm achieves the optimal outcome in the search for a target by simulating the leadership hierarchy and the group hunting mechanism of grey wolves. It shows advantages such as a fast search speed and a satisfactory optimization effect [23]. However, there is still room for improvement in the search strategy of GWO [24, 25]. Therefore, an improved Grey Wolf Optimization (α-GWO) algorithm is proposed as follows. The wolf pack is still divided into four levels, where the α, β, and δ wolves have strong search capability, the social rank of α is the highest in the population, and the remaining wolves are denoted as ω. The mathematical model for finding prey is expressed as follows:

D = |C · X_p(t) − X(t)|, X(t + 1) = X_p(t) − A · D, A = 2a · r1 − a, C = 2 · r2, (1)

where t represents the number of current iterations, A and C denote the synergy coefficients, X_p indicates the location of the prey, X refers to the current grey wolf position, a linearly decreases from 2 to 0, and r1 and r2 stand for random vectors in [0, 1]. In α-GWO, a competitive relationship between the head wolves is introduced to improve the global search capability. Corresponding to the search target of the head wolves in each iteration, the fault classification error is taken as the score to obtain the alpha score, beta score, and delta score. The head wolf levels are rearranged according to the fault error scores, and the wolf pack position is updated according to equations (2)–(4):

D_α = |C1 · X_α − X|, D_β = |C2 · X_β − X|, D_δ = |C3 · X_δ − X|, (2)
X1 = X_α − A1 · D_α, X2 = X_β − A2 · D_β, X3 = X_δ − A3 · D_δ, (3)
X(t + 1) = (X1 + X2 + X3) / 3, (4)

where X represents the location of a wolf in the pack, while D_α, D_β, and D_δ refer to the distances between the current candidate wolf and the best three wolves. When |A| > 1, the wolves disperse in search of prey; when |A| < 1, the wolves start to concentrate on attacking their prey. While ensuring that the selected α wolf has the strongest ability in the population, its weight is adjusted according to the change of the error and the number of current iterations, so as to gradually enhance the dominant position of the α wolf.
The improvement is expressed in equation (5), where t represents the number of current iterations, E_max indicates the maximum classification error, E denotes the current classification error, and T refers to the total number of iterations.
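As a concrete reference, the standard GWO position update can be sketched in Python as follows. The function name, pack size, iteration count, and clipping to the search bounds are illustrative assumptions, and the α-GWO competition and weighting step is deliberately omitted:

```python
import numpy as np

def gwo_minimize(fitness, dim, bounds, n_wolves=20, max_iter=200, seed=0):
    """Minimal sketch of the standard GWO update (names and defaults are illustrative)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, (n_wolves, dim))          # initial wolf positions
    scores = np.array([fitness(x) for x in X])
    for t in range(max_iter):
        order = np.argsort(scores)                    # re-rank the pack by fitness score
        alpha, beta, delta = X[order[0]], X[order[1]], X[order[2]]
        a = 2.0 - 2.0 * t / max_iter                  # a decreases linearly from 2 to 0
        for i in range(n_wolves):
            x_new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2.0 * a * r1 - a, 2.0 * r2     # synergy coefficients
                D = np.abs(C * leader - X[i])         # distance to the leader
                x_new += (leader - A * D) / 3.0       # average of the three pulls
            X[i] = np.clip(x_new, lo, hi)
            scores[i] = fitness(X[i])
    best = int(np.argmin(scores))
    return X[best], scores[best]
```

On a smooth test function such as the sphere function, this loop contracts toward the best three wolves as a shrinks, which is the behaviour the convergence comparison in Section 3 relies on.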
3. Fault Diagnosis Method Based on α-GWO-SVM
3.1. Data Preprocessing
In this paper, the data published online by Intel Labs [26] are used to perform fault injection in line with the existing methods [27]. Spike, bias, drift, precision drop, stuck, data loss, and random faults are injected into the original data. The raw data are shown in the appendix, and the fault samples obtained are shown in Figures 1–7.
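The injection of the seven fault types can be illustrated with a short Python sketch; the fault magnitudes and rates below are assumptions for demonstration, not the settings used on the Intel Labs data:

```python
import numpy as np

def inject_faults(x, rng=None):
    """Illustrative injection of the seven fault types into a clean series x.
    Magnitudes and fault rates are assumptions, not the paper's settings."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, s = len(x), x.std()
    faults = {}
    spike = x.copy()
    spike[rng.choice(n, n // 50, replace=False)] += 5.0 * s       # spike: isolated jumps
    faults["spike"] = spike
    faults["bias"] = x + 2.0                                       # bias: constant offset
    faults["drift"] = x + np.linspace(0.0, 3.0, n)                 # drift: slowly growing offset
    faults["precision_drop"] = x + rng.normal(0.0, 1.5 * s, n)     # precision drop: extra noise
    stuck = x.copy()
    stuck[n // 2:] = x[n // 2]                                     # stuck: output frozen
    faults["stuck"] = stuck
    loss = x.copy()
    loss[rng.choice(n, n // 20, replace=False)] = 0.0              # data loss: dropped readings
    faults["data_loss"] = loss
    rnd = x.copy()
    idx = rng.choice(n, n // 20, replace=False)
    rnd[idx] = rng.uniform(x.min(), x.max(), len(idx))             # random fault: arbitrary values
    faults["random"] = rnd
    return faults
```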







3.2. Data Feature Extraction
The Kernel Principal Component Analysis (KPCA) is usually conducted to extract features and reduce the dimensionality of nonlinear data [28]. The main steps of KPCA are detailed as follows. Suppose X = {x_1, x_2, …, x_n} is a collection of time-domain parameter vectors, where each vector x_i comprises the time-domain parameters of one sample. With Φ denoting the implicit kernel mapping, the kernel matrix K is calculated according to the following equation:

K_ij = ⟨Φ(x_i), Φ(x_j)⟩ = k(x_i, x_j). (6)

According to equation (7) [28], the new kernel matrix K_L is obtained by centring K:

K_L = K − LK − KL + LKL, (7)

where L is the n × n matrix whose entries are all equal to 1/n. The Jacobi method is applied to calculate the eigenvalues λ_i and eigenvectors v_i of the kernel matrix, and the eigenvalues are then sorted in descending order. The Gram–Schmidt orthogonalization process is followed to perform unit orthogonalization on the eigenvectors, so as to obtain v_1, v_2, …, v_n. Then, the first m components are extracted to obtain the transformation matrix:

V = [v_1, v_2, …, v_m]. (8)

Equation (9) is applied to convert each vector through the transformation matrix:

y_i = V^T Φ(x_i), (9)

where y_i refers to the extracted principal component vector.
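The KPCA steps above can be realized in a few lines of numpy; this is a minimal sketch with a Gaussian kernel, where the gamma value, component count, and input data are illustrative assumptions:

```python
import numpy as np

def kpca(X, n_components=3, gamma=0.1):
    """Manual KPCA with a Gaussian kernel (a sketch, not a library-grade implementation)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # kernel matrix
    L = np.full((n, n), 1.0 / n)
    KL = K - L @ K - K @ L + L @ K @ L                              # centred kernel matrix
    eigvals, eigvecs = np.linalg.eigh(KL)                           # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]                               # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # normalise the leading eigenvectors so each component axis has unit norm
    alphas = eigvecs[:, :n_components] / np.sqrt(np.maximum(eigvals[:n_components], 1e-12))
    return KL @ alphas                                              # projected principal components
```

Projecting a synthetic 100 × 7 feature matrix (seven time-domain parameters per sample, as in this paper) yields a 100 × 3 matrix whose columns are ordered by decreasing explained variance.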
The extracted principal components are fused with the time-domain parameters. The fused features not only contain the overall characteristics of the fault data but also reflect the local characteristics of the fault. Through multiple experimental comparisons, the mean, variance, crest factor, and skewness coefficient are taken as the reference indicators for the local features of the fault data, while the final fusion features are treated as samples.
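The four local indicators named above can be computed directly; this is a sketch, and the window handling is an assumption:

```python
import numpy as np

def time_domain_features(x):
    """Mean, variance, crest factor, and skewness of one sample window (a sketch)."""
    x = np.asarray(x, dtype=float)
    m, sd = x.mean(), x.std()
    rms = np.sqrt(np.mean(x ** 2))                    # root mean square
    return {
        "mean": m,
        "variance": x.var(),
        "crest_factor": np.max(np.abs(x)) / rms,      # peak value over RMS
        "skewness": np.mean((x - m) ** 3) / sd ** 3,  # third standardized moment
    }
```

Concatenating these four values with the KPCA principal components of a window gives the fused feature vector used as one sample.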
In total, 342 groups of samples are selected for this experiment, with 242 groups taken as the training dataset and the other 100 groups treated as the testing dataset. Labels 1–8 represent spike, drift, bias, random, stuck, precision drop, data loss fault, and normal, respectively. The training set sample and testing set sample are listed in Table 1.
3.3. Establishment of α-GWO-SVM Diagnosis Model
SVM provides an effective solution to the limited sample size and nonlinearity [29,30]. During model training and testing, the datasets usually consist of feature vectors and labels. The support vector is obtained by using the feature vector and label in the samples, and then the hyperplanes are established to separate different types of samples. More problems about Support Vector Machine mathematical modeling are detailed in [31]. The “one-to-one,” “one-to-many,” and “many-to-many” methods are used to address multiclassification issues [32].
The labeled fault data samples are used for SVM training: the samples and labels are used to build the support vectors, and then the hyperplanes are established, so as to achieve the division of different types of sample data. In essence, the mathematical model of the multiclass SVM is a convex quadratic programming problem. A critical step is to determine the appropriate kernel function coefficient g and penalty factor C. The mathematical modeling process of the multiclass SVM is detailed as follows.
The objective function is constructed for convex quadratic programming:

max_α Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j K(x_i, x_j), s.t. Σ_i α_i y_i = 0, 0 ≤ α_i ≤ C, (10)

where α_i represents the Lagrange multiplier, x_i and x_j indicate the input vectors, y_i denotes the category label, and K(x_i, x_j) refers to the kernel function. In fact, not all of the data can be fully linearly separated, so the hinge loss is taken into consideration:

min_{w, ξ} (1/2)‖w‖² + C Σ_i ξ_i, (11)

where w represents the normal vector of the hyperplane, ξ_i indicates the slack variable, with each sample corresponding to one ξ_i representing the degree to which the sample does not meet the constraints, and C denotes the penalty factor. The corresponding classification function is expressed as

f(x) = sign(Σ_i α_i y_i K(x_i, x) + b), (12)

where b represents the offset constant. The introduction of the kernel function is effective in improving the ability of the Support Vector Machine to deal with nonlinearity. In this paper, the Gaussian kernel function with superior performance is applied:

K(x_i, x_j) = exp(−‖x_i − x_j‖² / (2g²)). (13)
It can be seen from equations (11) and (13) that both the penalty factor C and the kernel function parameter g play an important role in determining the classification performance of the Support Vector Machine. The penalty factor C determines the degree of fit, and the kernel function parameter g determines the scope of the support vectors, thus determining the generalization ability of the SVM. Therefore, choosing appropriate parameters is crucial for improving the accuracy of classification.
3.4. α-GWO Algorithm Optimizes SVM
When the α-GWO algorithm is applied to optimize the parameters of the SVM, the kernel function parameter g and the penalty factor C are the parameters to be optimized. The optimization flow chart is shown in Figure 8. The optimization process is detailed as follows:
(i) Step 1: set the size of the wolf pack N, the maximum number of iterations T, and the search dimension d, before initializing the location of the wolf pack.
(ii) Step 2: initialize the Support Vector Machine parameters and the search ranges of C and g.
(iii) Step 3: calculate the error scores of the three head wolves under the current parameters to rearrange the levels of the wolves.
(iv) Step 4: with the smallest classification error of the elected α wolf as the fitness value, update the wolf pack position according to equations (2)–(5).
(v) Step 5: perform a comparison with the fitness value of the previous iteration. If the new value is not better than the original fitness value, it will not be updated; otherwise, the fitness value will be updated.
(vi) Step 6: perform the calculation cyclically until the maximum number of cycles is reached, output (C, g) at this time as the optimal parameters of the Support Vector Machine, and construct the SVM model.
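The core of the optimization loop above is a fitness function mapping a candidate (C, g) pair to the SVM classification error. The sketch below shows this wiring with scikit-learn; the dataset is synthetic, the search ranges are assumptions, and a coarse random search stands in for the α-GWO update:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic multiclass data standing in for the fused fault features.
X, y = make_classification(n_samples=200, n_features=6, n_classes=3,
                           n_informative=4, random_state=0)

def svm_error(pos):
    """Fitness of one candidate position: cross-validated classification error."""
    C, g = pos
    acc = cross_val_score(SVC(C=C, gamma=g, kernel="rbf"), X, y, cv=3).mean()
    return 1.0 - acc

# Any optimizer can drive this fitness; random candidates illustrate the idea.
rng = np.random.default_rng(0)
candidates = np.column_stack([rng.uniform(0.1, 100, 30),   # assumed search range for C
                              rng.uniform(0.01, 10, 30)])  # assumed search range for g
errors = np.array([svm_error(c) for c in candidates])
best_C, best_g = candidates[errors.argmin()]
model = SVC(C=best_C, gamma=best_g, kernel="rbf").fit(X, y)
```

Replacing the random candidates with the wolf positions of Section 2 yields the α-GWO-SVM model, with `svm_error` supplying the alpha, beta, and delta scores.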

In order to verify the effectiveness of the improved algorithm, benchmark functions are selected for testing, as shown in Figure 9.

Figure 10 shows the convergence curve after taking the logarithm; α-GWO tends to converge after 100 iterations, while GWO tends to converge after nearly 350 iterations, indicating that the convergence of α-GWO is faster than that of GWO. In addition, α-GWO is more accurate than GWO in searching for optimal values.

The testing dataset composed of fusion features is inputted into the classifier for testing. Figure 11 shows the iteration number and error curves of GWO-SVM and α-GWO-SVM. After 13 iterations of the GWO algorithm, the classification error of the SVM reaches 0.08, while with the α-GWO algorithm the classification error of the SVM reaches 0.04 after only 6 iterations, showing evident superiority over the original grey wolf algorithm. Moreover, it can be seen from the classification error that the α-GWO algorithm performs better in parameter optimization for the SVM in each iteration, indicating that the improved algorithm has a better capability of optimization.

4. Simulation Results and Performance Analysis
4.1. Diagnosis Results
4.1.1. Diagnosis Result Comparison before Feature Selection (BFS)
Fault data are obtained by fault injection into the original temperature data, as mentioned in the previous section. Then, the mean value, variance, root mean square, peak value, crest factor, skewness coefficient, and kurtosis coefficient are determined from the fault data, and the principal components of the seven time-domain parameters are extracted. Through the simulation experiment, the peak value, variance, crest factor, skewness coefficient, and the extracted principal components are finally selected and integrated to obtain the final dataset for SVM training and testing.
In this section, the experiments are arranged in two parts. The first part compares the effect of the dataset built from principal component extraction alone with that of the fusion dataset, and the second part compares the effects of the SVM optimized by different algorithms.
The samples of the BFS are inputted into the α-GWO-SVM diagnostic model for training and testing. Then, a comparison is performed with the GWO-SVM and Adaptive Particle Swarm Optimization SVM (APSO-SVM) diagnostic models. The results of diagnosis are shown in Figures 12–14.



It can be seen from the comparison of diagnostic results that APSO-SVM and GWO-SVM misclassify multiple types of faults and show the lowest ability to identify the data loss fault. α-GWO-SVM misclassifies a total of 9 groups of samples, performing better than the others. In spite of this, a variety of faults remain misclassified. It is evidenced that training the model only with the features extracted by KPCA fails to achieve an accurate diagnosis.
4.1.2. Diagnosis Result Comparison after Feature Selection (AFS)
The diagnosis results of the AFS are shown in Figures 15–17. According to the analysis of the diagnostic results, APSO-SVM and GWO-SVM become more accurate, the number of misclassified sample groups is smaller, and the classification performance is significantly improved. It is demonstrated that the fused features are effective in improving the reliability of diagnosis.



4.2. Comparative Analysis of Classifier Performance
Since this experiment is a multiclassification problem with an unbalanced distribution of samples [33], the precision and kappa coefficient are taken into consideration for evaluating the performance of the classifiers. Among them, precision represents the capability of a classifier to distinguish each type of sample correctly, and a greater value indicates a better classification performance. The kappa coefficient measures the consistency of the diagnostic results produced by the classifier with the actual categories of the samples [34]; likewise, a greater value indicates a better classification performance. The mathematical equations of the precision and kappa coefficient are expressed as follows:
Precision. Calculate the precision of each label separately, with the unweighted average taken:

Precision = (1/L) Σ_{i=1}^{L} TP_i / (TP_i + FP_i), (14)

where L is the number of classes, TP_i represents the number of true positives, and FP_i refers to the number of false positives for class i. A true positive indicates that the classifier diagnoses a sample accurately as its respective class, while a false positive means that the classifier diagnoses a sample inaccurately. The kappa coefficient is calculated as

kappa = (p_o − p_e) / (1 − p_e), p_e = Σ_i (a_i · b_i) / n², (15)

where p_o is the classification accuracy over all the samples, a_i is the number of real samples of class i, b_i is the number of diagnosed samples of class i, and n is the total number of samples.
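Both indices are available in scikit-learn as macro-averaged precision and Cohen's kappa; the toy labels below are made up purely for illustration:

```python
from sklearn.metrics import precision_score, cohen_kappa_score

# Toy true and predicted labels over four classes (illustrative values only).
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 1, 2, 3, 3, 3, 4, 4]

# Unweighted (macro) average of the per-class precisions.
precision = precision_score(y_true, y_pred, average="macro")
# Agreement corrected for the chance agreement implied by the label marginals.
kappa = cohen_kappa_score(y_true, y_pred)
```

Here one class-2 sample is diagnosed as class 3, so the macro precision drops below 1 through the class-3 term, and kappa falls below the raw accuracy because chance agreement is discounted.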
The performance index comparison results of the classifiers are shown in Figures 18 and 19 and Tables 2 and 3, respectively. As for the BFS, the precision of α-GWO-SVM reaches 93.83%, while the kappa coefficient reaches 89.91%. Besides, there are only 9 groups of misclassified samples, indicating the best classification performance. In contrast to the GWO algorithm, the precision is improved by 1.32%, while the kappa coefficient is increased by 2.24%, suggesting that the improved algorithm performs better in optimizing the parameters of the Support Vector Machine.


With regard to the AFS, the classifiers produce an excellent performance. The precision of α-GWO-SVM is 97.29% and the kappa coefficient is 95.52%. Besides, as few as 4 groups of samples are misclassified. As compared to the BFS, the precision is improved by 2.82% and the kappa coefficient is increased by 4.49%, suggesting that the feature fusion is effective in enhancing the reliability of diagnosis.
5. Conclusion
The considerable contributions of the presented sensor fault diagnosis method in comparison to previous approaches are summarized as follows:
(i) In order to improve the accuracy of sensor fault diagnosis, an integrated sensor fault diagnosis approach based on the combination of data-driven and intelligent diagnosis is proposed in this paper. According to the results, this method is capable of achieving an accurate diagnosis of sensor faults when the failure intensity stays low.
(ii) In order to fully extract the valuable information from the fault data, a method of feature extraction is put forward based on the fusion of KPCA and time-domain parameters, and experiments are conducted to demonstrate that the fusion features improve the accuracy of diagnosis effectively.
(iii) In addition, the α-GWO algorithm is proposed to optimize the parameters of the SVM, thus enhancing the generalization ability of the SVM. Through multiple comparison experiments and the analysis of performance indicators such as the precision and kappa coefficient, it is concluded that, as compared to the other intelligent diagnosis algorithms based on SVM, the α-GWO-SVM diagnostic method produces a better classification performance, and that the proposed method is effective in improving the reliability of diagnosis. In the future, the focus of research will be on the universality of the proposed method.
Data Availability
The data used to support the findings of this study are included within the supplementary information file.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This study was supported by the National Natural Science Foundation of China Program under grant no. 62073198 and by the Major Research Development Program of Shandong Province of China under grant no. 2016GSF117009.
Supplementary Materials
“Experimental data.docx” contains the dataset used for the experiment in this paper. (Supplementary Materials)