Abstract

Due to the complexity of the structure and process of large-scale petrochemical equipment, different fault characteristics are mixed and exhibit multiple couplings and ambiguities, which makes composite faults in rotating machinery difficult to identify. This paper proposes a composite fault diagnosis method for the rotating machinery of large units based on evidence theory and multi-information fusion. Evidence theory and multi-information fusion mainly deal with multisource and conflicting information: they synthesize multiple pieces of uncertain information from multiple data sources into a single fused result. To detect faults in rotating machinery, the dimensionless index ranges of composite faults are first used to form a reference feature set. Then, a two-sample distribution test is applied to compare the known fault samples with the tested fault samples, and the maximum statistical distance is obtained for each dimensionless index. Finally, the multiple maximum statistical distances are fused by evidence theory, and the fault types are identified based on the fusion result. The proposed method was applied to a large petrochemical unit simulation experiment system; the results show that it can accurately identify composite faults and provide maintenance guidance for composite fault diagnosis.

1. Introduction

Rotating machinery in industrial petrochemical plants works in complex environments in which fault signals are difficult to separate, which complicates fault diagnosis decisions [1–3]. Once a large unit develops a problem, it must be stopped entirely for inspection, which can result in a huge economic loss. Therefore, it is essential to quickly identify the fault signal and predict the fault types.

For the fault diagnosis problem of rotating machinery, researchers have presented solutions including dimensionless algorithms [4–6], neural networks [7–9], classification methods [10–14], and evidence theory [15–17]. Among these, the dimensionless algorithm is insensitive to signal disturbances and to changes in amplitude and frequency. Therefore, it has been widely used in rotating machinery fault diagnosis [18, 19]. In [20], three EEMD-based dimensionless indexes were proposed to characterize the steadiness states of railway axle bearings and detect different defects. Wu et al. [21] proposed a method combining the Hilbert–Huang transform and instantaneous dimensionless frequency normalization and applied it to a gearbox system. However, with these methods the index ranges of the normal state and of various fault states overlap to some extent. In other words, the ranges of the dimensionless indexes of normal equipment and faulty equipment are difficult to distinguish, which makes the decision more difficult. To solve these problems, Xiong et al. [22] proposed a genetic programming method based on time-domain dimensionless indexes, which has achieved positive results in rotating machinery classification. However, constructing new dimensionless indexes with this method has several deficiencies. For instance, the operator set and the terminal set affect the complexity and convergence of the program, and when the search scope is expanded, potentially useful fault information is easily lost. All of these factors can affect fault diagnosis efficiency. To solve the problem of information loss due to reduction and clustering classification during the generation of fault feature information, Dempster [23] proposed an integrated fault diagnosis method based on a dimensionless index immune detector. In [24, 25], a small-sample method was proposed, but the method had an overfitting problem. In [26, 27], a fault diagnosis method for induction motors based on a sparse denoising autoencoder was proposed, which reduced the risk of network overfitting with small samples.

Evidence theory is an uncertainty theory [28–31] whose major characteristics are measuring and handling various kinds of uncertain information and using a synthesis rule to combine information from multiple sources. In this way, evidence theory can process multisource and conflicting information well and has therefore been widely used in areas such as information fusion and uncertain reasoning. The authors of [32, 33] took advantage of evidence theory to propose a fault diagnosis method based on multisensor information fusion, which improved the fault diagnosis accuracy rate. Xiao [34] proposed a method based on evidence theory and fuzzy preference to handle the conflicting evidence combination problem in a multisensor environment; however, it does not consider cracks and misalignment. In [35, 36], a multiparameter comprehensive diagnosis system model was proposed, which addressed the difficult problem of distinguishing various faults in the same symptom domain. Song and Jiang [37] proposed a new evidential fault diagnosis method in which multiple hypotheses are taken into consideration.

Existing applications of evidence theory and multi-information fusion are mainly aimed at fusing information from multiple sensors; however, it is impractical to install multiple sensors on a petrochemical equipment unit. To improve the accuracy of fault diagnosis under this constraint, we propose in this paper a fault diagnosis method based on dimensionless indexes, multi-information fusion, and the two-sample distribution test. We believe that the faults of large units are difficult to identify mainly because of the existence of multisource and conflicting information. The multi-information fusion method can fuse multiple pieces of uncertain probabilistic information to determine the potential faults of samples. The main contributions of this paper are summarized as follows:

(i) The distribution of the known fault samples is compared with that of the tested fault samples by a two-sample distribution test, and the maximum statistical distance is obtained

(ii) Multiple pieces of dimensionless information are fused, and the fault types are identified according to the fused results

(iii) The proposed method is verified on a large petrochemical unit simulation experiment system and is shown to effectively improve fault identification

The rest of this paper is organized as follows: Section 2 describes the process of fault diagnosis and theoretical basis: dimensionless algorithm, two-sample distribution test, and multi-information fusion. Section 3 verifies the reliability of the proposed method. In Section 4, the conclusion of this paper is presented.

2. Proposed Method

In this section, the dimensionless algorithm is first briefly introduced. Then, the two-sample distribution test is used to compare the cumulative probability distribution functions of the known fault samples and the tested fault samples, and the maximum statistical distance is obtained. Next, the multiple maximum statistical distances are fused. Finally, the process of fault identification is described. The diagram of composite fault diagnosis for the rotating machinery of a large unit based on evidence theory and multi-information fusion is shown in Figure 1.

2.1. Dimensionless Algorithm

Dimensionless indexes are ratios of two quantities with the same dimensions. The basic idea behind the dimensionless algorithm is to eliminate the dimension by taking the ratio of two quantities of the same dimension, both derived from the probability density function, so that the dimensionless indexes are not affected by the frequency and amplitude of the mechanical signal in fault diagnosis [38, 39]. The dimensionless index is defined as follows:

$$\zeta_x = \frac{\left[\int_{-\infty}^{+\infty} |x|^{l}\, p(x)\, dx\right]^{1/l}}{\left[\int_{-\infty}^{+\infty} |x|^{m}\, p(x)\, dx\right]^{1/m}}, \tag{1}$$

where $x$ is the amplitude of the vibration time-domain signal, $p(x)$ is the probability density function, and $l$ and $m$ are the numerator and denominator coefficients, respectively.

The five dimensionless indexes, waveform index, peak index, impulse index, margin index, and kurtosis index, are shown in Table 1 [40].
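As a minimal sketch of how the five indexes can be computed from a sampled signal, the following Python function uses the commonly cited time-domain definitions (root mean square over mean absolute value for the waveform index, and so on). Table 1 is not reproduced here, so these formulas, the function name `dimensionless_indexes`, and the synthetic sine test signal are assumptions for illustration rather than the authors' implementation.

```python
import math


def dimensionless_indexes(signal):
    """Return five time-domain dimensionless indexes of a sampled signal.

    Assumed definitions (common in the literature, not taken from Table 1):
      waveform = RMS / mean|x|
      peak     = max|x| / RMS
      impulse  = max|x| / mean|x|
      margin   = max|x| / (mean sqrt|x|)^2
      kurtosis = mean(x^4) / RMS^4
    """
    n = len(signal)
    abs_vals = [abs(v) for v in signal]
    mean_abs = sum(abs_vals) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    root_amp = (sum(math.sqrt(a) for a in abs_vals) / n) ** 2
    peak = max(abs_vals)
    fourth_moment = sum(v ** 4 for v in signal) / n
    return {
        "waveform": rms / mean_abs,
        "peak": peak / rms,
        "impulse": peak / mean_abs,
        "margin": peak / root_amp,
        "kurtosis": fourth_moment / rms ** 4,
    }


if __name__ == "__main__":
    # Synthetic signal for illustration only (not data from the experiment).
    sig = [math.sin(2 * math.pi * i / 1024) for i in range(1024)]
    for name, value in dimensionless_indexes(sig).items():
        print(f"{name}: {value:.4f}")
```

Because every index is a ratio of two quantities with the same dimension, multiplying the signal by a constant leaves all five values unchanged, which is the amplitude insensitivity described above.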

Assume that the vibration time-domain signal of the known fault sample is $x(t)$ and that of the tested fault sample is $y(t)$.

The dimensionless indexes of the known fault samples are $A = [a_{\min}, a_{\max}]$. The dimensionless indexes of the tested fault samples are $B = [b_{\min}, b_{\max}]$. Among them, $a_{\min}$ and $a_{\max}$ are the minimum and maximum of the dimensionless indexes of the known fault samples, and $b_{\min}$ and $b_{\max}$ are the minimum and maximum of the dimensionless indexes of the tested fault samples.

2.2. Two-Sample Distribution Test

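Let the probability density of the nonstationary random signal $x(t)$ be $p(x, t)$. The probability density function at time $t$ is defined [41] as

$$p(x, t) = \lim_{\Delta x \to 0} \frac{P[x < x(t) \le x + \Delta x]}{\Delta x}, \tag{2}$$

where $P[\cdot]$ is the probability that the signal falls within a given amplitude region. For each state of the process, this probability can be described as

$$P[x < x(t) \le x + \Delta x] = \lim_{T \to \infty} \frac{\Delta T}{T}, \tag{3}$$

where $T$ represents the length of the sample and $\Delta T$ represents the total time during which the signal lies between $x$ and $x + \Delta x$.

The probability that the value of $x(t)$ is less than or equal to $x$, that is, the probability distribution function of the signal, is represented by $F(x)$:

$$F(x) = P[x(t) \le x] = \int_{-\infty}^{x} p(u, t)\, du. \tag{4}$$

For a random variable $X$, the cumulative probability distribution function can likewise be represented by $F(x)$:

$$F(x) = P(X \le x). \tag{5}$$

$F_n(x)$ is used to represent the cumulative probability distribution function of random sample observations with sample size $n$, that is, the fraction of the $n$ observations that are less than or equal to $x$.

Assume that the cumulative probability distribution function of the known fault samples is $F_{n_1}(x)$ and that of the tested fault samples is $G_{n_2}(x)$, where $n_1$ and $n_2$ are the respective sample sizes. By comparing the cumulative probability distribution function of the known fault samples with that of the tested fault samples, the statistical distance at each cumulative frequency is obtained [42]:

$$D(x) = \left| F_{n_1}(x) - G_{n_2}(x) \right|. \tag{6}$$

After comparing $A$ with $B$, the statistical distance at the $i$-th observed value $x_i$ is $D_i$, and the specific value can be described:

$$D_i = \left| F_{n_1}(x_i) - G_{n_2}(x_i) \right|. \tag{7}$$

The maximum statistical distance, the $k$ value, between $F_{n_1}(x)$ and $G_{n_2}(x)$ is obtained by comparison:

$$k = \max_{x} \left| F_{n_1}(x) - G_{n_2}(x) \right|. \tag{8}$$

After comparing $A$ with $B$, the $k$ value can be described:

$$k = \max_{i} D_i. \tag{9}$$

If $n_1 \to \infty$ and $n_2 \to \infty$:

$$\lim_{n_1, n_2 \to \infty} P\!\left( \sqrt{\frac{n_1 n_2}{n_1 + n_2}}\; k \le \lambda \right) = \sum_{j=-\infty}^{+\infty} (-1)^{j} e^{-2 j^{2} \lambda^{2}}. \tag{10}$$

A small probability $\alpha$ can be described as

$$P(k > k_\alpha) = \alpha. \tag{11}$$

For each $k$ value, if $k > k_\alpha$, the difference between $F_{n_1}(x)$ and $G_{n_2}(x)$ is too large, and the hypothesis that the tested fault samples obey the distribution of the known fault samples is rejected. Otherwise, the distribution function of the known fault samples has a high degree of fitting with the distribution function of the tested fault samples.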
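The test described above can be sketched in a few lines of Python. The sketch computes the maximum distance between the two empirical cumulative distribution functions and compares it with a large-sample critical value. The function names, the significance level `alpha = 0.05`, and the use of the asymptotic approximation for the critical value are assumptions made for illustration; the paper does not state which significance level or critical-value table was used.

```python
import math
from bisect import bisect_right


def max_statistical_distance(known, tested):
    """Maximum distance k between the empirical CDFs of two samples."""
    a, b = sorted(known), sorted(tested)
    k = 0.0
    # Both empirical CDFs are step functions, so the maximum distance
    # is attained at one of the observed values.
    for x in a + b:
        f = bisect_right(a, x) / len(a)  # empirical CDF of known samples at x
        g = bisect_right(b, x) / len(b)  # empirical CDF of tested samples at x
        k = max(k, abs(f - g))
    return k


def critical_value(n1, n2, alpha=0.05):
    """Approximate large-sample critical value for the two-sample test."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n1 + n2) / (n1 * n2))


def same_distribution(known, tested, alpha=0.05):
    """Return (k, True) if the same-distribution hypothesis is not rejected."""
    k = max_statistical_distance(known, tested)
    return k, k <= critical_value(len(known), len(tested), alpha)


if __name__ == "__main__":
    # Illustrative index values only (not data from the experiment).
    known = [1.10, 1.12, 1.15, 1.18, 1.20, 1.22]
    tested = [1.11, 1.13, 1.16, 1.17, 1.21, 1.23]
    print(same_distribution(known, tested))
```

A small k means the two empirical distributions are close, which is the basis for the "same fault, minimum k value" principle used in the experiments.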

2.3. Multi-Information Fusion

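Assume that $\Theta$ is the discrete set of all possible hypotheses (the frame of discernment) and that the elements of $\Theta$, denoted as $\theta_i$, are independent and mutually exclusive; $m: 2^{\Theta} \to [0, 1]$, $m(\emptyset) = 0$, and $\sum_{A \subseteq \Theta} m(A) = 1$. $m(A)$ is the basic probability number of $A$, indicating the exact trust in $A$.

For any assumption $A \subseteq \Theta$, its trust degree is defined as the sum of the basic probabilities corresponding to all subsets of $A$, that is, $\mathrm{Bel}: 2^{\Theta} \to [0, 1]$. $\mathrm{Bel}(A)$ is defined as follows:

$$\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B). \tag{12}$$

The $\mathrm{Bel}$ function, called the lower bound function, represents the full trust in $A$.

For any assumption $A \subseteq \Theta$, the likelihood function $\mathrm{Pl}(A)$, which is the sum of the basic probabilities corresponding to all subsets that intersect $A$, is as follows:

$$\mathrm{Pl}(A) = 1 - \mathrm{Bel}(\bar{A}) = \sum_{B \cap A \ne \emptyset} m(B), \tag{13}$$

where the $\mathrm{Pl}$ function is called the upper-bound function, which represents the degree to which $A$ is not disbelieved.

The relationship between the trust function and the likelihood function is obtained from formulas (12) and (13) as follows: $\mathrm{Pl}(A) \ge \mathrm{Bel}(A)$. The uncertainty of $A$ is expressed as follows: $u(A) = \mathrm{Pl}(A) - \mathrm{Bel}(A)$. Then, $(\mathrm{Bel}(A), \mathrm{Pl}(A))$ is called the trust interval. The trust interval and fitting range are shown in Figure 2.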
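To make the trust interval concrete, the following sketch evaluates the trust and likelihood functions for a small basic probability assignment. Representing focal elements as Python `frozenset` objects, the function names, and the example masses over three hypothetical faults `F1`, `F2`, and `F3` are all illustrative assumptions.

```python
def belief(m, a):
    """Bel(A): total mass of the focal elements that are subsets of A."""
    return sum(v for b, v in m.items() if b <= a)


def plausibility(m, a):
    """Pl(A): total mass of the focal elements that intersect A."""
    return sum(v for b, v in m.items() if b & a)


def trust_interval(m, a):
    """Return the pair (Bel(A), Pl(A))."""
    return belief(m, a), plausibility(m, a)


if __name__ == "__main__":
    theta = frozenset({"F1", "F2", "F3"})
    # Hypothetical masses; the empty set carries no mass and is not stored.
    m = {
        frozenset({"F1"}): 0.5,
        frozenset({"F2"}): 0.2,
        frozenset({"F1", "F2"}): 0.2,
        theta: 0.1,
    }
    print(trust_interval(m, frozenset({"F1"})))
```

For the hypothesis {F1} in this example, the trust is 0.5 and the likelihood is about 0.8, so the uncertainty is about 0.3.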

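Assuming that $m_1$ and $m_2$ are two probability assignment functions on $\Theta$, their orthogonal sum $m = m_1 \oplus m_2$ is defined as

$$m(\emptyset) = 0, \qquad m(A) = \frac{1}{K} \sum_{B \cap C = A} m_1(B)\, m_2(C) \quad (A \ne \emptyset), \tag{14}$$

where $K = 1 - \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C) = \sum_{B \cap C \ne \emptyset} m_1(B)\, m_2(C)$.

According to formula (14), the orthogonal sum of multiple probability assignment functions, $m = m_1 \oplus m_2 \oplus \cdots \oplus m_n$, is defined as follows:

$$m(\emptyset) = 0, \qquad m(A) = \frac{1}{K} \sum_{\bigcap_i A_i = A}\; \prod_{i=1}^{n} m_i(A_i) \quad (A \ne \emptyset), \tag{15}$$

where $K = \sum_{\bigcap_i A_i \ne \emptyset}\; \prod_{i=1}^{n} m_i(A_i)$.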
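The orthogonal sum can be sketched as follows, again with focal elements stored as `frozenset` objects. The names `combine` and `combine_all` and the example masses are assumptions for illustration. Multiple assignments are combined by applying the pairwise rule repeatedly, which is valid because Dempster's rule is associative.

```python
from functools import reduce
from itertools import product


def combine(m1, m2):
    """Orthogonal sum of two basic probability assignments."""
    fused = {}
    conflict = 0.0
    for (b, v1), (c, v2) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            fused[inter] = fused.get(inter, 0.0) + v1 * v2
        else:
            conflict += v1 * v2  # mass that would fall on the empty set
    k = 1.0 - conflict  # normalization constant K
    if k <= 1e-12:
        raise ValueError("total conflict: the orthogonal sum is undefined")
    return {a: v / k for a, v in fused.items()}


def combine_all(assignments):
    """Orthogonal sum of several assignments, applied pairwise."""
    return reduce(combine, assignments)


if __name__ == "__main__":
    f1, f2 = frozenset({"F1"}), frozenset({"F2"})
    theta = frozenset({"F1", "F2"})
    # Hypothetical masses from two pieces of evidence.
    m1 = {f1: 0.6, f2: 0.3, theta: 0.1}
    m2 = {f1: 0.5, f2: 0.4, theta: 0.1}
    for focal, mass in combine_all([m1, m2]).items():
        print(sorted(focal), round(mass, 4))
```

In this example the conflicting mass is 0.39, so K = 0.61, and the fused mass on F1 is 0.41/0.61, about 0.672, which is higher than either source assigned individually.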

2.4. Fault Identification Process

The process of evidence theory and multi-information fusion is described below:

Step 1. The vibration time-domain signal of the rotating machinery is collected. The vibration time-domain signal of the known fault sample is $x(t)$, and that of the tested fault sample is $y(t)$.

Step 2. The dimensionless algorithm is applied to the vibration time-domain signal to obtain five dimensionless indexes, according to equation (1) and Table 1. The dimensionless index of the known fault samples is $A$, and that of the tested fault samples is $B$.

Step 3. According to $A$ and $B$ in step 2, the cumulative frequency of the known fault samples, $F_{n_1}(x)$, and that of the tested fault samples, $G_{n_2}(x)$, are computed.

Step 4. The cumulative probability distribution of the known fault samples is compared with that of the tested fault samples, and the maximum statistical distance is obtained according to equations (6)–(9).

Step 5. The evidence theory and multi-information fusion method is used to fuse the maximum statistical distances and obtain the fusion result according to equations (12)–(15).
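The last two steps can be sketched end to end as follows. The text does not specify how a k value is converted into a basic probability assignment, so this sketch assumes a simple hypothetical mapping: each dimensionless index assigns every candidate fault a mass proportional to 1 − k. Under that assumption all mass lies on single faults, and the orthogonal sum reduces to a normalized product. The function names and the k values are illustrative, not the authors' implementation or data.

```python
import math


def k_to_bpa(k_by_fault):
    """Hypothetical mapping: a smaller k gives a larger mass (assumption)."""
    sims = {fault: 1.0 - k for fault, k in k_by_fault.items()}
    total = sum(sims.values())
    if total <= 0.0:
        raise ValueError("all k values equal 1: no supporting evidence")
    return {fault: s / total for fault, s in sims.items()}


def fuse_singletons(bpas):
    """Orthogonal sum when every focal element is a single fault."""
    faults = list(bpas[0])
    joint = {f: math.prod(b[f] for b in bpas) for f in faults}
    norm = sum(joint.values())  # equals K, one minus the conflicting mass
    if norm <= 0.0:
        raise ValueError("total conflict: the orthogonal sum is undefined")
    return {f: v / norm for f, v in joint.items()}


def diagnose(k_table):
    """k_table maps index name -> {fault name: k value}."""
    bpas = [k_to_bpa(row) for row in k_table.values()]
    fused = fuse_singletons(bpas)
    return max(fused, key=fused.get), fused


if __name__ == "__main__":
    # Made-up k values for two candidate faults and two indexes.
    k_table = {
        "waveform": {"fault A": 0.2, "fault B": 0.6},
        "peak": {"fault A": 0.1, "fault B": 0.7},
    }
    print(diagnose(k_table))
```

Even with only two indexes, fusion sharpens the decision: each index alone gives the first fault a mass of at most 0.75, while the fused mass is 6/7, about 0.857. A similar effect, accumulated over five indexes, would be consistent with the near-certain fused probabilities reported in the experiments.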

3. Validation Experiment

In this section, in order to verify the effectiveness of the proposed approach, the composite fault diagnosis of rotating machinery is studied and verified by using the large petrochemical unit simulation experiment system.

3.1. Data Acquisition and Processing

The simulation experiment system consists of a multistage centrifugal air compressor unit, various test stations, and test software. The acquisition software can display the waveform of the vibration time-domain signal in real time, extracting signal characteristics while storing historical data and monitoring the operation of the unit. The large petrochemical unit simulation experiment system is shown in Figure 3, and the model and parameters of its main components are shown in Table 2.

In the experiments, the EMT390 data collector is used to collect the vibration time-domain signal of the composite faults as discrete 1024-point data sets. Under each fault condition, 100 groups of vibration time-domain signal data are collected. The first 50 groups are used as the known fault samples, and the latter 50 groups are the tested fault samples. The test conditions are as follows: the motor speed is 1000 r/min and the motor rated power is 11 kW. The sensor is placed next to the gearbox, as shown in Figure 3(b).

According to the laboratory conditions, six different faults were combined, and four fault conditions of rotating machinery are shown in Figure 4.

Vibration time-domain signal acquisition and processing:

(1) In this paper, the sampling frequency (rate) is 1024 Hz, which means 1024 points are collected per second, and a total of 400 seconds is collected. The acquired vibration time-domain signals of the different faults are shown in Figure 5.

(2) The vibration time-domain signal of the rotating machinery is processed according to the dimensionless algorithm, and the ranges of the five dimensionless indexes are obtained, as shown in Table 3.
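The grouping of the recorded signal can be sketched as below. At 1024 Hz, a 1024-point group corresponds to one second of data, so 100 groups per fault condition correspond to 100 seconds; reading the 400-second total as four conditions of 100 seconds each is an assumption, as are the function names and the placeholder signal.

```python
def split_into_groups(signal, group_size=1024):
    """Split a sampled signal into consecutive, non-overlapping groups."""
    n_groups = len(signal) // group_size
    return [signal[i * group_size:(i + 1) * group_size] for i in range(n_groups)]


def known_and_tested(groups, n_known=50, n_tested=50):
    """First n_known groups are known samples, the next n_tested are tested."""
    return groups[:n_known], groups[n_known:n_known + n_tested]


if __name__ == "__main__":
    sampling_rate_hz = 1024
    seconds_per_condition = 100  # assumed: 100 one-second groups per condition
    # Placeholder values standing in for a recorded vibration signal.
    signal = [0.0] * (sampling_rate_hz * seconds_per_condition)
    groups = split_into_groups(signal)
    known, tested = known_and_tested(groups)
    print(len(groups), len(known), len(tested))
```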

3.2. Discussion of Experimental Results
3.2.1. Two-Sample Distribution Test

Figure 6 shows the results of the two-sample distribution test for the different faults. For each sample, the maximum statistical distance (k value) is computed for each of the five dimensionless indexes (waveform index, impulse index, margin index, peak index, and kurtosis index), and the five k values of each fault are joined into a fold line. Each color in Figure 6 represents a different fault.

The distribution of the five k values for different faults and different samples can be seen in Figure 6. The two-sample distribution test identifies the fault types based on the five k values of the dimensionless indexes. The k value between samples of the same fault is small relative to that between samples of different faults, indicating that the two samples belong to the same distribution; a larger k value indicates that the two samples belong to different distributions. The individual samples are described below:

Sample 1. The four fold lines overlap each other, and the fault type of the sample cannot be recognized.

Sample 2. The fold line for gear teeth missing and bearing outer ring wear lies at the bottom, and its five k values are smaller than those of the other faults. The faults of this sample are therefore gear teeth missing and bearing outer ring wear.

Sample 3. It can be seen from the figure that the five k values for gear teeth missing and bearing outer ring wear are the largest, so this fault is ruled out. However, the other three fold lines overlap each other, and the fault of the sample cannot be identified.

Sample 4. The four fold lines overlap each other without significant difference. In this situation, the fault cannot be identified.

Sample 5. Figure 6(e) shows that the fold lines of three faults still overlap, so the fault cannot be identified.

Sample 6. From the distribution of the fold lines, it is possible to identify that the faults of the sample are large and small gear teeth missing and bearing inner ring wear.

There are no obvious change rules in the five k values of Samples 1, 3, 4, and 5. The main cause of this situation is the presence of multisource and conflicting information. The change rules of Samples 2 and 6 are obvious, and according to the principle that the same fault gives the minimum k value, their fault types can be identified: gear teeth missing and bearing outer ring wear for Sample 2, and large and small gear teeth missing and bearing inner ring wear for Sample 6. Nevertheless, only two of the six tested fault samples were identified, an accuracy of only 33.33%.

3.2.2. Evidence Theory and Multiple Information Fusion Results

The effectiveness of the proposed composite fault diagnosis method based on evidence theory and multi-information fusion for a large petrochemical unit is verified, and the results are presented in Figure 7. Each color represents the probability of a different fault occurring in the same sample. The results of fusing the five k values by evidence theory and multi-information fusion differ between samples, as described below:

Sample 1. The columns have different heights, so the probability of each fault is different. Although the probabilities of two of the faults are similar in Sample 1, the fault can still be identified by the maximum probability. The faults that occurred are gear teeth missing and bearing inner ring wear.

Samples 2, 4, and 6. The column charts of these three samples show that the maximum probabilities reached 99.91%, 100%, and 99.92%, respectively. The method is able to identify the fault types separately: gear teeth missing and bearing outer ring wear, gear teeth missing and bearing lack of ball, and large and small gear teeth missing and bearing inner ring wear.

Samples 3 and 5. The fault types were accurately identified by selecting the highest column for each sample. The faults that occurred in these samples are, respectively, large and small gear teeth missing, and large and small gear teeth missing with bearing outer ring wear.

By comparing the probabilities of the different faults, the fault types are identified based on the maximum probability. The probability of occurrence of each fault is given in Figure 7. For the six groups of tested fault samples, the accuracy rate reached 100%, an improvement of 66.67 percentage points over the 33.33% achieved by the two-sample distribution test alone. The experimental results demonstrate that the proposed method is able to identify the fault types accurately.

4. Conclusion

This paper proposes a composite fault diagnosis method for the rotating machinery of large units based on evidence theory and multi-information fusion. It effectively handles multisource and conflicting information and improves the accuracy of composite fault diagnosis.

The results of the simulation experiment showed that, by using the evidence theory and multi-information fusion method, the composite fault types can be accurately identified from the maximum probability of occurrence. In future work, we will further study the robustness and applicability of the proposed method in real applications by considering more rotating machinery working conditions and noise levels.

Data Availability

The vibration time-domain signal data used to support the findings of this study are currently under embargo, while the research findings are commercialized. Requests for data, 6/12 months after publication of this article, will be considered by the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61673127 and 61473331) and the Science and Technology Planning Project of Guangdong Province (Nos. 2015A030401103 and 2017A070712024).