Abstract

With the rapid expansion of big data across all domains, data-driven and deep learning-based fault diagnosis methods for the chemical industry have become a major research topic in recent years. Beyond deep neural networks, deep forest offers a new approach to deep representation learning and overcomes shortcomings of deep neural networks such as strong parameter dependence and high training cost. However, the standard cascade forest does not take the ability of each base classifier into account, which may blunt its discriminative power. In this paper, a multigrained scanning-based weighted cascade forest (WCForest) is proposed and applied to fault diagnosis in chemical processes. In view of the high-dimensional nonlinear data produced by chemical processes, WCForest first designs a set of relatively suitable windows for the multigrained scanning strategy to learn data representations. Next, considering the fitting quality of each forest classifier, a weighting strategy is proposed that calculates the weight of each forest in the cascade structure without additional computational cost, so as to improve the overall performance of the model. To demonstrate the effectiveness of WCForest, it is applied to the benchmark Tennessee Eastman (TE) process. Experiments show that WCForest achieves better results than related approaches across various evaluation metrics.

1. Introduction

Performance improvement and ease of monitoring have become increasingly important in industrial processes. Operating under extreme conditions, modern industrial processes are becoming more and more complex. When monitoring technology was underdeveloped and historical fault data were scarce, diagnosis technology mainly consisted of two types of methods: process-model-based and knowledge-based [1, 2]. These make diagnosis results easy to understand, but their cost is too high for systems with many devices and many state variables [3]. Modern industries, however, are developing toward large scale and high complexity, and with the widespread use of monitoring technology, large volumes of industrial process data have been collected from broadly deployed sensors and other control equipment. Therefore, making maximal use of these massive data to further improve both the accuracy and the speed of fault diagnosis is significant for a complicated process monitoring system.

With the increase of storage capacity and computing power, data-driven fault diagnosis methods have been widely used in chemical processes [4, 5]. Among these, multivariate statistical methods, mainly including principal component analysis (PCA) [6, 7], partial least squares (PLS) [8, 9], independent component analysis (ICA) [10, 11], Fisher discriminant analysis (FDA) [12, 13], random forest (RF) [14], canonical correlation analysis (CCA) [15], exponential discriminant analysis (EDA) [16], and their derivatives [17–22], have made rapid progress. Although these data-driven methods have achieved certain results, two shortcomings remain. On one hand, most of them rely on the assumption of a single data distribution (e.g., a Gaussian distribution) [23, 24], but in actual industrial processes, data do not always strictly follow a given distribution. Expert experience is then needed, or an approximate hypothesis can be adopted to process the data, at the risk of diagnostic errors. On the other hand, in the context of big data, the above methods saturate easily: once the sample size grows beyond a certain scale, it is difficult to exploit the remaining samples to further improve fault diagnosis accuracy.

To maximize the use of massive data, deep learning (DL) has in recent years been applied to various big data fields, and a large number of DL-based fault diagnosis methods have emerged [25–28]. Xie and Bai [29] proposed a hierarchical deep neural network (HDNN) to diagnose faults in the benchmark Tennessee Eastman (TE) process: a supervisory deep neural network (DNN) first assigns faults to groups, and for each group a specially trained DNN is triggered for further diagnosis. Zhang and Zhao [30] presented an extensible deep belief network- (DBN-) based fault diagnosis model, in which DBN subnets extract features of fault data in the spatial and temporal domains and a global back-propagation network performs fault classification. Moreover, a deep convolutional neural network- (DCNN-) based fault diagnosis method was also proposed [31], which achieves better results than the former. However, some shortcomings may limit the application of DNNs in fault diagnosis: (1) DNNs are mainly designed for images and spectrograms in computer vision and speech recognition, so to extract both spatial and temporal features, the input data in fault diagnosis are usually arranged into a two-dimensional matrix covering a period of time [29–31], which may degrade real-time performance. (2) It is well known that DNN performance depends heavily on parameter tuning because of the large number of hyperparameters.

To alleviate the aforementioned shortcomings of DNNs, an alternative, gcForest [32], was proposed in 2017; it achieves comparable or even better results than DNNs in several domains. gcForest has far fewer hyperparameters than a DNN and can be trained easily without elaborate parameter tuning. However, two key issues in gcForest deserve attention: the diversity of the classifiers and the power of each classifier. For the former, different forests can be used, such as random forests and completely random forests. For the latter, this paper proposes a weighted cascade forest (WCForest) model. The main idea of WCForest is to assign a weight to each forest in the cascade structure, amplifying the influence of well-performing forests and suppressing that of poor ones.

The remainder of this paper is organized as follows. Section 2 introduces the principle and mathematical model of gcForest. The WCForest-based fault diagnosis model is proposed in Section 3. The applications of WCForest in the TE process and the comparisons with other fault diagnosis methods are discussed in Section 4. Finally, conclusions are drawn in Section 5.

2. Multigrained Cascade Forest

gcForest consists of two integrated components: multigrained scanning and the cascade forest. Multigrained scanning slides windows over high-dimensional input to extract local contexts, whose representations are learned by different forests. The cascade forest then learns increasingly discriminative representations level by level under the supervision of the input representations, so that the ensemble of forests gives a more accurate prediction.

2.1. Multigrained Scanning

Inspired by the way CNNs exploit feature relationships, gcForest adopts a sliding window-based multigrained scanning strategy. An illustration of the process is given in Figure 1. Suppose that the training dataset contains instances of $c$ classes and the dimension of each instance is $d$. A sliding window of size $k$ is used to scan each instance, so $(d - k + 1)$ $k$-dimensional feature vectors are generated by scanning each raw instance sequentially. All feature vectors extracted from a raw instance are regarded as derived instances of its class. For each $k$-dimensional derived instance, each forest generates a $c$-dimensional class vector. The $(d - k + 1)$ derived instances of each raw instance are input into a random forest and a completely random forest to generate their class-distribution vectors, which are then concatenated into a transformed feature vector of $2c(d - k + 1)$ dimensions. As shown in Figure 1, the training dataset includes three classes, each raw instance has 400 dimensions, and the sliding window size is 100. The above process therefore yields a $2 \times 3 \times 301 = 1806$-dimensional transformed feature vector for each 400-dimensional raw feature vector. Compared with the raw vector, the transformed feature vector has many more dimensions and an enhanced feature representation.
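To make the transformation concrete, the following is a minimal NumPy/scikit-learn sketch of the sliding-window scan described above. The function name is ours, `ExtraTreesClassifier` stands in for the completely random forest, and, unlike gcForest, this sketch predicts on the same data the forests were fitted on rather than using cross-validated class vectors.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

def multigrained_scan(X, y, k, n_trees=100):
    """Transform (n, d) raw instances into (n, 2*c*(d-k+1)) feature vectors
    by scanning with a window of size k and collecting class vectors from
    a random forest and a (here, extremely randomized) forest."""
    n, d = X.shape
    # derived instances: every length-k window of every raw instance
    windows = np.stack([X[:, i:i + k] for i in range(d - k + 1)], axis=1)
    n_win = windows.shape[1]
    flat = windows.reshape(n * n_win, k)
    labels = np.repeat(y, n_win)          # each window inherits the raw label
    parts = []
    for forest in (RandomForestClassifier(n_trees, random_state=0),
                   ExtraTreesClassifier(n_trees, random_state=0)):
        forest.fit(flat, labels)
        # c-dimensional class vector per window, concatenated per instance
        parts.append(forest.predict_proba(flat).reshape(n, -1))
    return np.concatenate(parts, axis=1)

# 3 classes, d = 400, k = 100: output has 2 * 3 * 301 = 1806 dimensions
X = np.random.rand(60, 400)
y = np.random.randint(0, 3, 60)
print(multigrained_scan(X, y, k=100).shape)   # (60, 1806)
```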

2.2. Cascade Forest

In the cascade forest, each layer assembles a number of decision forests, receives the features processed by its previous layer, and passes its processing results to the next layer. Each layer is designed to include different types of forests to encourage diversity. Figure 2 shows the schematic of an example cascade forest, in which two types of forests (random forests in green and completely random forests in blue) are used. The number of forests per layer and the number of trees in each forest are hyperparameters in practice. The instances are input to the cascade layer, and each forest produces an estimate of the class distribution. The class-distribution outputs of all forests in the same layer form a class vector, which is then concatenated with the raw vector as the input of the next layer. Cross-validation is used to evaluate the performance gain of each new layer; when there is no improvement, the expansion automatically terminates.
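As a sketch of this layer-by-layer expansion, the following illustrates, under our own simplified assumptions, how levels can be added until cross-validated accuracy stops improving. A production cascade would also generate the augmented class vectors by cross-validation rather than by refitting on the full training set.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

def grow_cascade(X, y, max_layers=8, n_forests=4, n_trees=100, tol=1e-3):
    """Add cascade layers while k-fold CV accuracy keeps improving."""
    feats, best, layers = X, 0.0, []
    for _ in range(max_layers):
        layer = [RandomForestClassifier(n_trees, random_state=i) if i % 2
                 else ExtraTreesClassifier(n_trees, random_state=i)
                 for i in range(n_forests)]
        # evaluate the candidate layer with cross-validation
        score = np.mean([cross_val_score(f, feats, y, cv=3).mean()
                         for f in layer])
        if score <= best + tol:
            break                          # no improvement: stop expanding
        best = score
        class_vectors = []
        for f in layer:
            f.fit(feats, y)
            class_vectors.append(f.predict_proba(feats))
        layers.append(layer)
        # concatenate the layer's class vectors with the raw input
        feats = np.hstack([X] + class_vectors)
    return layers
```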

For each instance, each forest generates an estimated class-distribution vector by averaging the classification probabilities of all its trees. The classification probability of a tree is the proportion of each class among the training instances at the leaf node into which the concerned instance falls. The derivation of the class-distribution estimate of a random forest is shown in Figure 3. Suppose the class-distribution vector obtained by the $i$th tree in a forest is $p_i = (p_{i,1}, p_{i,2}, \ldots, p_{i,c})$, where $c$ represents the number of classes, and each forest contains $n$ trees; then the class-distribution vector generated by the forest is $P = (P_1, P_2, \ldots, P_c)$, where $P_j = \frac{1}{n} \sum_{i=1}^{n} p_{i,j}$.
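The averaging in this formula is exactly what scikit-learn's random forest does internally, as the following small check on synthetic data illustrates:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=52, n_classes=3,
                           n_informative=10, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# p_i: leaf class proportions of the i-th tree; P_j = (1/n) sum_i p_{i,j}
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])
P = per_tree.mean(axis=0)

# predict_proba performs the same averaging over trees
assert np.allclose(P, forest.predict_proba(X))
```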

3. WCForest-Based Fault Diagnosis Method

3.1. Weighted Cascade Forest

As a substitute for a DNN, the cascade forest learns high-level representations at low cost. It does not learn hidden variables through the complex forward and backward propagation used in DNNs; instead, it directly learns class-distribution features by assembling a large number of decision-tree-based forests under the supervision of the input. This layer-wise supervised learning strategy makes cascade forests easy to train. Moreover, the ensemble of forests can acquire more precise class-distribution features, owing to its strong performance in most classification applications. However, in the standard cascade forest model, all forests in each cascade structure contribute equally to the final prediction, so the estimated class distribution may be overly sensitive to poorly fitted forests. To alleviate this problem, this section introduces WCForest, a new variant of the cascade forest.

Inspired by weighted voting, we give higher weights to strong classifiers than to weak ones during the training of the cascade structure. Defining rules to set the weights of the forests in the cascade structure is not straightforward. On one hand, the sample sets used to train the forests are random, so the result of a single classification is not a suitable measure of a forest's quality. On the other hand, extensive calculation and estimation of weights may bring additional costs. In this study, we attempt to set the weights of the forests as objectively as possible without additional cost.

Specifically, the performance of a forest can be measured by its training results on different training subsamples. To mitigate the risk of overfitting, cross-validation is used to evaluate the overall performance of each layer. Therefore, the cross-validation classification accuracy of each forest can be used to estimate its weight. The reasons for using cross-validation to calculate weights are as follows: (1) Cross-validation is already the default way to evaluate the performance of a new layer in the cascade forest, so using it as the weighting strategy incurs no additional computational cost. (2) Cross-validation assesses the classification quality of each forest through multiple verifications, which removes the contingency of a single verification.

Assuming that there are $c$ classes in the training set, the weight of each forest in each layer is estimated by $k$-fold cross-validation. The training set is divided into $k$ subsample sets; one of them is retained as a verification set, and the other $k - 1$ subsample sets are used to train the forest. Cross-validation is repeated $k$ times, each subsample set validating the forest once and yielding a classification accuracy. After training and verifying each level of the cascade forest, the classification accuracy matrix ACC is generated as

$$\mathrm{ACC} = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,k} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,k} \end{pmatrix},$$

where $a_{i,j}$ denotes the accuracy of the $j$th cross-validation of the $i$th forest, $m$ represents the number of random forests in each cascade structure, and $k$ is the number of cross-validations.

According to ACC, the average classification accuracy of the $i$th forest is

$$\bar{a}_i = \frac{1}{k} \sum_{j=1}^{k} a_{i,j},$$

and the weight matrix is defined as $W = (w_1, w_2, \ldots, w_m)$, where $w_i$ represents the weight of the $i$th forest, obtained by normalizing the average accuracies:

$$w_i = \frac{\bar{a}_i}{\sum_{l=1}^{m} \bar{a}_l}.$$
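A minimal sketch of this weighting step follows, assuming (as in the reconstruction above) that the weights are the fold-averaged accuracies normalized to sum to one; the function name is ours.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def forest_weights(forests, X, y, k=5):
    """Build ACC (one row per forest, one column per fold) and return w_i."""
    acc = np.zeros((len(forests), k))
    folds = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for j, (train_idx, val_idx) in enumerate(folds.split(X, y)):
        for i, f in enumerate(forests):
            f.fit(X[train_idx], y[train_idx])
            acc[i, j] = f.score(X[val_idx], y[val_idx])   # a_{i,j}
    mean_acc = acc.mean(axis=1)                            # average accuracy
    return mean_acc / mean_acc.sum()                       # w_i
```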

Given a new instance, each forest produces an estimate of the class distribution as described in Section 2.2. Assuming that the class-distribution vector obtained by the $i$th random forest in a cascade layer is $P_i = (P_{i,1}, P_{i,2}, \ldots, P_{i,c})$, the weighted class-probability vector passed to the next cascade structure is $w_i P_i$, which is concatenated with the raw vector as an input to the next layer.

If the current layer is the last layer of the model, the class-distribution matrix of the cascade forest is

$$P = \begin{pmatrix} P_{1,1} & P_{1,2} & \cdots & P_{1,c} \\ P_{2,1} & P_{2,2} & \cdots & P_{2,c} \\ \vdots & \vdots & \ddots & \vdots \\ P_{m,1} & P_{m,2} & \cdots & P_{m,c} \end{pmatrix}.$$

The weighted class-probability results can then be calculated as

$$\mathrm{prob}_j = \sum_{i=1}^{m} w_i P_{i,j},$$

where $\mathrm{prob}_j$ represents the total probability of class $j$.

Finally, the class with the maximum total probability is chosen as the fault classification result, as shown in Figure 4.
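Putting the last two formulas together, a sketch of the final weighted decision (reusing the hypothetical `forest_weights` helper above) might look like:

```python
import numpy as np

def diagnose(forests, weights, X_new):
    """prob_j = sum_i w_i * P_{i,j}; the argmax is the diagnosed class."""
    # stack each forest's class-distribution estimates: (m, n_samples, c)
    P = np.stack([f.predict_proba(X_new) for f in forests])
    prob = np.tensordot(weights, P, axes=1)   # weighted sum over forests
    return prob.argmax(axis=1), prob
```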

3.2. WCForest-Based Fault Diagnosis Model

The process data of industrial processes are usually high-dimensional and noisy. Generally, the original input space is mapped to a feature space by feature extraction, but the quality of that extraction directly affects classifier performance. The two sources of randomness in a random forest (bootstrap sampling of instances and random selection of candidate split features) give it good noise resistance, and when the input data are high-dimensional, representation learning can be further enhanced by multigrained scanning, which gives WCForest a degree of context or structure awareness. In the WCForest-based fault diagnosis model, the data collected from each monitor in the industrial process are diagnosed to obtain an evaluation of the process state at each time point.

In this paper, the model consists of two parts: multigrained scanning-based feature extraction and weighted cascade forest-based fault diagnosis. After the data are collected, multigrained scanning is used to extract representation vectors from the training and testing sets. Then, the weighted cascade forest classification model is trained on the representation vectors of the training set and validated on those of the testing set. Finally, the classification results of the testing set are obtained. The flow chart of the model is shown in Figure 5. Its diagnostic procedure includes offline modeling and online diagnosis, described as follows (a compact sketch of this flow is given after the steps):

Offline stage:
Step 1. Historical data are collected from the chemical process and preprocessed.
Step 2. The data collected at each time point are composed into $d$-dimensional vectors and labeled with their corresponding classes, either "normal" or one of the fault types.
Step 3. The samples, with their labels, are divided into a training set and a testing set.
Step 4. Given several different sets of windows, the training set is used to select one set of windows for multigrained scanning.
Step 5. The class-probability vectors of the training and testing sets are obtained by multigrained scanning with the selected set of windows.
Step 6. On the training set, $k$-fold cross-validation is used to train the WCForest model and to obtain the weight vector of each layer. The WCForest model is then verified using the class-probability vectors of the testing set.
Step 7. The fault diagnosis result is output and visualized.

Online stage:
Step 1. Online data are collected from the chemical process.
Step 2. Online sample vectors are input to the WCForest, which gives a predicted diagnosis result for each sample vector: either "normal" or one specific fault type.
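The following self-contained sketch traces the offline/online flow above on synthetic stand-in data, omitting the multigrained scanning step (Steps 4 and 5) for brevity; the dataset, forest choices, and inline weighting are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Offline stage (Steps 1-3, 6): collect, label, split, weight, train, verify.
X, y = make_classification(n_samples=2200, n_features=52, n_classes=4,
                           n_informative=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
forests = [RandomForestClassifier(400, random_state=0),
           ExtraTreesClassifier(400, random_state=1)]
# weight each forest by its k-fold CV accuracy, normalized to sum to one
acc = np.array([cross_val_score(f, X_tr, y_tr, cv=5).mean() for f in forests])
weights = acc / acc.sum()
for f in forests:
    f.fit(X_tr, y_tr)

# Online stage (Steps 1-2): each incoming sample vector receives a diagnosis.
probs = sum(w * f.predict_proba(X_te) for w, f in zip(weights, forests))
print("verification accuracy:", (probs.argmax(axis=1) == y_te).mean())
```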

4. Experiment Result

In this section, the proposed WCForest-based fault diagnosis model is applied to the TE process. Furthermore, its results are compared with those of other decision tree-based ensemble methods (RF, XGBoost, and AdaBoost), gcForest, and the existing literature.

4.1. Tennessee Eastman Process

As a simulation platform based on a real industrial process, the Tennessee Eastman process is widely used to evaluate the performance of monitoring methods in data-driven fault detection research. Figure 6 shows a flow diagram of the TE process. The process consists of 5 major unit operations: a reactor, a product condenser, a vapor-liquid separator, a recycle compressor, and a product stripper. The process involves four reactants (A, C, D, and E) and an inert component B. The four reactants and the inert component are fed to the reactor, which produces liquid products G and H and a byproduct F. The reactions are irreversible, exothermic, and approximately first-order with respect to the reactant concentrations.

The TE process includes 41 measured variables and 12 manipulated variables. However, one of the manipulated variables, the reactor agitator speed, is held constant and need not be analysed. The remaining 52 variables are used as research variables [33] and are all listed in Table 1; the first 41 are measured variables and the last 11 are manipulated variables. The TE process contains the 21 faults listed in Table 2. The data used for fault classification of the TE simulation system can be downloaded from http://web.mit.edu/braatzgroup. Each state (the normal state and 21 different fault states) is divided into training and testing sections. The data are sampled once every three minutes. The training data comprise 500 samples over 25 hours, with the fault introduced after one hour, so only the remaining 480 fault samples are used to train a fault diagnosis model. The testing data comprise 960 samples over 48 hours, with the fault introduced after 8 hours; that is, the fault samples are collected from the 161st sampling point onward. With these normal and fault sample sets, a complete WCForest model can be trained and tested.

4.2. WCForest Model for TE Process

A WCForest model suitable for TE process fault diagnosis is designed and constructed, in which the construction of the forests and several hyperparameters need to be selected experimentally, such as the number of trees in each forest, the number and types of forests, and the setting of the feature windows. To find a suitable model, we carried out the following experiments.

In constructing a decision tree, information gain and the Gini index are generally used as heuristic functions for feature selection. We tested both selection rules, and the resulting test accuracies were not significantly different. Therefore, the slightly better Gini index was selected as the node-splitting rule of the random forest models in this paper.

For the hyperparameters of the model, the numbers of forests in the multigrained scanning and in the cascade structure follow the settings used for gcForest [32]: 2 forests (1 completely random forest and 1 random forest) are used in multigrained scanning, and 8 forests (4 completely random forests and 4 random forests) are set up in each layer of the weighted cascade forest. However, the number of trees per forest and the setting of the scanning windows have a great influence on diagnostic accuracy, so these parameters need to be optimized. First, we examined the number of trees per forest and found that, beyond a certain size, it has little effect on diagnostic accuracy; considering the time complexity, the number of trees was therefore fixed. It is a common issue that there is no scientific guidance for the setting of windows. To find a proper setting, we tried several window settings: [15, 30], [20, 45], [13, 30, 42], [18, 36, 45], [16, 27, 35, 42], and [18, 25, 36, 47].

For this experiment, 400 samples per class are randomly selected; 80% of the samples of each class form the training dataset, and the rest form the testing dataset. Fault diagnosis is tested on one sample at a time. The average testing diagnostic accuracy and the training/testing times of the different window settings are listed in Table 3. The window setting [18, 25, 36, 47] has the highest average testing diagnostic accuracy (75.9%) and takes 83.4 min for training. With only a small decrease in accuracy (75.6%), the window setting [18, 36, 45] takes 20 min less than [18, 25, 36, 47]. In the following discussion, [18, 36, 45] is therefore chosen as the window setting.

4.3. Fault Diagnosis Result

To present the experimental results of fault diagnosis, two commonly used indicators, the fault detection rate (FDR) and the false positive rate (FPR), are adopted to evaluate the diagnostic performance of the model. They are calculated from the general confusion matrix defined in Table 4 as

$$\mathrm{FDR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}.$$
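For a single fault class, these two rates can be computed from label vectors as in the following sketch (the function name is ours):

```python
import numpy as np

def fdr_fpr(y_true, y_pred, fault):
    """FDR = TP/(TP+FN) and FPR = FP/(FP+TN) for one fault class,
    following the confusion-matrix entries of Table 4."""
    is_fault = (y_true == fault)
    flagged = (y_pred == fault)
    tp = np.sum(flagged & is_fault)
    fn = np.sum(~flagged & is_fault)
    fp = np.sum(flagged & ~is_fault)
    tn = np.sum(~flagged & ~is_fault)
    return tp / (tp + fn), fp / (fp + tn)
```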

Table 5 shows the FDRs of the 21 faults in the TE process obtained by WCForest, gcForest, and three decision tree-based ensemble methods: random forest (RF), XGBoost, and AdaBoost. The parameter settings of gcForest are the same as those of WCForest. The parameters of the remaining algorithms, all chosen after multiple rounds of tuning, are set as follows:
(1) RF: 400 decision trees, Gini index as the splitting rule.
(2) XGBoost: 400 decision trees (i.e., 400 boosting iterations), learning rate = 0.1, softmax loss as the objective function.
(3) AdaBoost: 500 decision trees, learning rate = 0.6.

Compared with the diagnostic results of RF, the FDRs of most faults increase to varying degrees, which is of great significance for industrial production and theoretical research. RF and XGBoost perform well on faults with obvious feature differences, such as faults 1, 2, 6, and 7, and poorly on the notoriously difficult faults 3, 9, and 15. AdaBoost generally has a low diagnostic rate, with the exception of fault 2, which reaches nearly 100%. gcForest enhances the perception of differences between features in the cascade structure and improves the classification ability of the model through multigrained scanning representation learning. The weights in WCForest further improve the robustness and sparsity of the model.

WCForest performs best among the five methods, with an average FDR of 84.13%, about 62.18% higher than that of AdaBoost. The FDRs of faults 6, 7, and 21 increase by 100%, the largest improvement among all 21 faults, and the improvements for more than 70% of the faults exceed 50%. RF and XGBoost have similar diagnostic rates across the faults; relative to them, the average FDR increases by 24.21% and 15.77%, respectively. Compared with gcForest, the performance of WCForest improves by nearly 2%. The FDR for faults 2, 6, 7, and 21 is 100%, meaning there are neither false alarms nor missed alarms. Furthermore, the FDR exceeds 90% for 11 faults and 95% for 6 of them, which is an important achievement.

A performance comparison of the five methods is shown in Figure 7. Clearly, WCForest and gcForest outperform RF, XGBoost, and AdaBoost, and compared with gcForest, the performance of WCForest is slightly improved.

To further demonstrate the validity of WCForest for the TE process, the FPRs are shown in Table 6, again comparing RF, XGBoost, AdaBoost, and gcForest.

As shown in Table 6, WCForest has an average FPR of 2.45%, a 27.58% decrease compared to AdaBoost and a 0.37% decrease compared to gcForest. The FPRs of faults 1, 2, 6, 7, 8, 17, 18, and 21 are zero, which is of great significance in industry. In addition, the FPR of half of the faults is reduced by more than 40% compared to AdaBoost. Figure 8 shows the detailed comparison results.

To further examine the performance of WCForest, we compare it with the methods listed in Table 7, which shows that our fault diagnosis model performs better than the others. Except for faults 3 and 15, the other 19 faults have diagnostic rates of more than 50%; in particular, the rates for faults 3 and 9 are 20% better than those of the other models (except that of [30]). Compared with the DBN-based fault diagnosis model proposed in [30], the FDRs of the 21 faults differ little, and the average FDR is only 1.23% higher. It should be noted that fault 15 has a relatively poor diagnostic result and needs further investigation.

To evaluate the quality of the proposed method more thoroughly, the $F_1$ score is selected as an additional evaluation indicator. It is a classical index in the machine learning field [38] that assesses classifiers based on recall and precision, computed as their harmonic mean.

From the general confusion matrix in Table 4, recall and precision are calculated as

$$\mathrm{recall} = \frac{TP}{TP + FN}, \qquad \mathrm{precision} = \frac{TP}{TP + FP}.$$

Thus, the $F_1$ score can be calculated as

$$F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}.$$
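As a check, these formulas can be computed directly or with scikit-learn's built-in helper; the tiny label vectors below are illustrative only.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

def f1_from_counts(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# per-class precision, recall, and F1 for a multiclass diagnosis task
y_true = np.array([0, 1, 2, 1, 0, 2, 1])
y_pred = np.array([0, 1, 2, 0, 0, 2, 1])
p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, labels=[0, 1, 2])
print(p, r, f1)
```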

The $F_1$ scores of WCForest are shown in Table 8, reflecting the diagnostic ability of the model. Recall and precision both reach almost 100% on faults 1, 2, 6, 7, and 21, indicating excellent true positive and false positive performance. Finally, Figure 9 shows the recall and precision of WCForest, which indicates that the proposed method performs well.

4.4. Hierarchical Representation Learning Visualization

To understand the characterization process of WCForest and the hierarchical representations it learns, it is important to observe the diagnostic results of each layer intuitively. As the learned features are high-dimensional, the results of each layer are difficult to visualize. To address this problem, we use t-distributed stochastic neighbour embedding (t-SNE) [39] to visualize the hierarchical representation learning process of the WCForest model.

The t-SNE method is a variant of stochastic neighbour embedding (SNE) [40]. It uses symmetric SNE, replacing the conditional probabilities between data points in the high-dimensional and low-dimensional spaces with joint probabilities. Meanwhile, a Gaussian distribution is used in the high-dimensional space and a Student's t-distribution with one degree of freedom in the low-dimensional space, which alleviates the crowding problem of SNE. Therefore, t-SNE can better preserve the complex nonlinear relationships among high-dimensional data during dimension reduction.

We use t-SNE to embed the high-dimensional features of each layer into a two- or three-dimensional space, which can be visualized as scatter plots; the feature learning process can then be inspected through the 2D or 3D map of each layer. Through experiments, we found that 3D maps are not well suited to visualizing the WCForest-based fault diagnosis model. Therefore, the high-dimensional output features of each layer are embedded in 2D maps and plotted in the subgraphs of Figure 10.
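A sketch of this visualization step follows, assuming matplotlib and the scikit-learn implementation of t-SNE; the function name and plotting choices are ours.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_layer_embedding(layer_features, labels, title):
    """Embed one layer's high-dimensional output features into 2D with t-SNE
    and colour each point by its actual class label (cf. Figure 10)."""
    emb = TSNE(n_components=2, perplexity=30,
               random_state=0).fit_transform(layer_features)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab20", s=8)
    plt.title(title)
    plt.show()
```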

For visualization, 600 samples covering the 22 classes (one normal and 21 faults) were randomly selected from the testing set. The input data form a 600 × 52 matrix, and these 52-dimensional vectors are transformed into 600 two-dimensional vectors by the t-SNE method. In each subgraph of Figure 10, the points are marked with their actual class labels: “Normal” with “0,” “Fault 01” with “1,” and so on. In addition, different colors are used to distinguish the classes for viewing. The output of each layer is likewise converted to two-dimensional vectors by t-SNE so that it can be visualized in 2D (Figures 10(b)–10(h)).

As shown in Figure 10(a), the raw process data samples of all classes are mixed together. The distribution of the features produced by multigrained scanning is shown in Figure 10(b). Then, as the weighted cascade forest learns representations at successive levels, the samples gradually cluster by class label in the t-SNE maps (Figures 10(c)–10(f)). This indicates that the nonlinear expression ability of the WCForest model increases with the number of layers: by deepening the cascade, WCForest maps initially inseparable features into a space where they become separable. This also confirms the rationality of the deep cascade design, and these subgraphs provide strong evidence that the WCForest model is effective for fault diagnosis tasks.

4.5. Model Performance

Is the number of training samples crucial for obtaining good diagnostic performance? To answer this question, we compared the average training and testing accuracies on training datasets of 10560, 8800, 6600, 4400, and 2200 samples, with 600 testing samples, and show the results in Figure 11. The figure shows that the testing accuracy of the WCForest model is strongly affected by the number of training samples, especially in the early stage; although the later gains are smaller, accuracy continues to increase. The training accuracy, by contrast, is hardly affected by the number of training samples.

5. Conclusions

In this paper, an improved deep forest model, WCForest, is proposed for fault diagnosis of chemical processes, with the goals of improving accuracy, reducing the false alarm rate, and handling high-dimensional, nonlinear data. Its main feature is that, without increasing computational complexity, $k$-fold cross-validation is used to calculate the weight of each forest in the cascade structure, boosting the well-performing forests and weakening the poor ones, so as to improve the overall performance of the cascade forest.

To demonstrate the performance of the proposed model, RF, XGBoost, AdaBoost, gcForest, and WCForest were applied to the benchmark TE process, with 16 known faults and 5 unknown faults for testing. The WCForest model achieves an average FDR of 84.13% and an average FPR of 2.45%, a high accuracy and a low false positive rate, comparable to the average diagnostic rates reported in the literature. To provide more information about the performance of the model, the $F_1$ score is also used as an evaluation measure of the integrity and purity of the classifier. Our work shows the validity and efficiency of WCForest, which can perform fault diagnosis in the TE process and can serve as a reference for other chemical processes. In addition, most data samples are clearly and correctly clustered by WCForest in the t-SNE maps.

Because of its excellent fault diagnosis rate and false positive rate, this method has industrial prospects. Data-driven fault diagnosis methods depend on the collection of a large number of process malfunction samples, and our WCForest-based model inevitably shares this drawback. In the near future, our research will focus on fault diagnosis when only a limited number of fault samples are available.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by a grant from the National Natural Science Foundation of China (NSFC) (61562054).