Abstract
The variational mode decomposition (VMD) method has been widely used in signal processing, with significant advantages over other decomposition methods in eliminating modal aliasing and in noise robustness. The number of intrinsic mode functions (IMFs), usually denoted by K, has a great influence on the decomposition results. For signals with complex components, existing methods often fail to obtain correct results, and effective methods for determining the K value are lacking. A method called center frequency statistical analysis (CFSA) is proposed in this paper to determine the K value. The CFSA method obtains the K value from the center frequency histogram. To shed further light on its performance, we analyze the behavior of the CFSA method on simulation signals with varying component amplitude, component frequency, component number, and noise amplitude. Normal and fault vibration signals obtained from a bearing experimental setup are used to verify the method. Compared with the maximum center frequency observation (MCFO), correlation coefficient (CC), and normalized mutual information (NMI) methods, CFSA is more robust and accurate, and the resulting center frequencies are consistent with the main frequencies in the FFT spectrum.
1. Introduction
Variational mode decomposition (VMD) has been used in various applications since it was proposed by Dragomiretskiy and Zosso [1]. VMD obtains intrinsic mode functions (IMFs) through a completely nonrecursive decomposition, which gives it stronger noise immunity than the empirical mode decomposition (EMD) method. The number of IMFs cannot be set by the user in EMD, whereas it can in VMD. The low-frequency component of VMD readily expresses the general fluctuation trend of the original signal, a characteristic that is difficult to observe with EMD. In addition, VMD has great advantages in handling modal aliasing and high-noise signals compared with EMD, ensemble empirical mode decomposition (EEMD), and empirical wavelet transform (EWT) [2, 3].
When VMD is used to decompose a signal, there are six parameters (α, τ, K, DC, init, ε) that need to be determined. Here, α represents the penalty factor, τ represents the fidelity coefficient, K represents the number of IMFs, DC represents the updating parameter of the first center frequency, init represents the initialization parameter of the center frequencies, and ε represents the convergence threshold. Normally, the parameters τ, DC, init, and ε can be assigned their default values. However, the parameters α and K need to be optimized according to the actual signal. The greater the value of α, the faster the attenuation on both sides of the center frequency. α is mainly determined according to the principle of avoiding aliasing between mode functions and is generally 1/6∼2 times the sampling frequency. Its specific value needs to be further determined according to the characteristics of the signal [4]. There is so far no unified criterion or method for determining the value of K [5], although much research has been done and a large number of methods have been proposed in recent years. At present, methods for determining the IMF number in VMD can be divided into three categories: (1) methods based on center frequency observation, (2) methods based on threshold criteria, and (3) other methods.
(1) The first type of method determines the K value by observing the center frequencies or the spectrum distribution of the IMFs. Liu [6] proposed the maximum center frequency observation (MCFO) method for determining the value of K. When two sets of center frequencies are similar to each other, the value of K is considered the best. Zheng et al. [7, 8] determined the K value according to the spectrum distribution of the IMF components. The principle is that, when a suitable K value is selected, the spectra of the IMFs neither overlap nor lose frequency information. This kind of method is practicable to some extent but lacks quantitative analysis.
(2) The second type of method determines the K value by a threshold criterion, such as correlation coefficient, mutual information, kurtosis, or distance measurement. In [9], the K value is determined by whether the correlation coefficient (CC) between the reconstructed signal and the original signal reaches a threshold. Zhang et al. [10] proposed the correlation and energy ratio to determine the number of IMFs. Liu et al. [11] used the normalized mutual information (NMI) between the IMFs and the original signal to judge whether the K value is appropriate. Wang et al. [12] determined K based on the number of IMFs whose permutation entropy is greater than a threshold. Huang et al. [13] determined K based on a proposed normalized distance (ND) indicator, which describes the degree of similarity between the reconstructed signal and the original signal. The threshold in this kind of method is difficult to determine and is strongly subjective. Zhao and Li [14] used kurtosis to select sensitive IMFs. The kurtosis criterion is suitable only for signals with large periodic shock components; it is not suitable for signals without impacts or with weak impacts.
(3) The third type comprises other ways of determining the value of K. Li et al. [15] proposed a method in which the value of K equals the number of IMFs obtained by EMD on the same signal. Liu et al. [16] selected K based on detrended fluctuation analysis (DFA). Feng et al. [17] proposed determining the IMF number as the sampling frequency divided by two times the meshing frequency in a gearbox. Zhang et al. [18] used the number of peaks in the envelope spectrum of the product function (PF) components from local mean decomposition (LMD) to determine the value of K. Wang et al. [19] took the minimum average envelope entropy as the fitness function and used the particle swarm optimization algorithm to optimize the parameter K and the penalty factor. The algorithms of this kind of method are more complex.
To determine the IMF number simply and accurately when a complex signal is decomposed by VMD, the CFSA method is proposed in this work. The rest of the paper is organized as follows. Section 2 introduces the VMD algorithm and the MCFO, CC, NMI, and CFSA methods. In Section 3, the classic methods and the CFSA method are studied using constructed simulation signals. In Section 4, the classic methods and the CFSA method are applied to rolling bearing experimental signals. Conclusions are drawn in Section 5.
2. Theoretical Introduction
2.1. Variational Mode Decomposition
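VMD nonrecursively decomposes a real-valued multicomponent signal into a discrete number of quasi-orthogonal, band-limited subsignals with specific sparsity properties in the spectral domain. Each mode is compact around a center pulsation, and its bandwidth is estimated using the Gaussian smoothness of the shifted signal. VMD is written as a constrained variational problem:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t), \tag{1}$$
where $u_k$ and $\omega_k$ are the $k$-th intrinsic mode function and its center frequency, respectively, and $f(t)$ is the signal to be decomposed.
Equation (1) can be solved by introducing a quadratic penalty and Lagrangian multipliers. The augmented Lagrangian is given as follows:
$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle, \tag{2}$$
where $\alpha$ denotes the balancing parameter of the data-fidelity constraint and $\lambda$ is the Lagrangian multiplier.
Equation (2) is then solved with the alternating direction method of multipliers (ADMM). The modes obtained from the solution in the spectral domain are written as
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}, \tag{3}$$
where $\omega_k$ is the frequency corresponding to the center of gravity of the power spectrum of the corresponding IMF. Wiener filtering is thus embedded in the VMD algorithm, which makes it much more robust to sampling and noise. The update equation for the center frequency is expressed as
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k(\omega) \right|^2 d\omega}. \tag{4}$$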
The complete VMD algorithm is described in detail in [1]. The purpose of applying VMD to a signal is to obtain several IMFs, and the choice of the number of IMFs is crucial for the decomposition results. Next, several representative methods for determining the number of IMFs are introduced, and a novel method is then proposed.
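As a rough illustration of the two spectral-domain ingredients described in this subsection, the following Python sketch evaluates a Wiener-type filter gain around a center frequency and the center of gravity of a mode's power spectrum. The gain has the form used in [1]; the function names, the discrete sums standing in for integrals, and the toy spectrum are assumptions made for illustration, and the full ADMM iteration is not reproduced here.

```python
def wiener_gain(omega, omega_k, alpha):
    """Filter gain at frequency omega for a mode centered at omega_k.

    A larger penalty factor alpha narrows the passband around omega_k.
    """
    return 1.0 / (1.0 + 2.0 * alpha * (omega - omega_k) ** 2)


def center_of_gravity(omegas, mode_spectrum):
    """Power-weighted mean frequency of one mode (non-negative frequencies)."""
    power = [abs(u) ** 2 for u in mode_spectrum]
    total = sum(power)
    if total == 0.0:
        raise ValueError("mode spectrum has no energy")
    return sum(w * p for w, p in zip(omegas, power)) / total


# Hypothetical toy spectrum concentrated around a normalized frequency of 0.2.
omegas = [0.0, 0.1, 0.2, 0.3, 0.4]
mode_spectrum = [0.0, 1.0, 2.0, 1.0, 0.0]
omega_k = center_of_gravity(omegas, mode_spectrum)

# Attenuation 0.1 away from the center for two penalty factors.
gain_small_alpha = wiener_gain(0.3, omega_k, alpha=200.0)
gain_large_alpha = wiener_gain(0.3, omega_k, alpha=2000.0)
```

In this toy case the larger penalty factor attenuates the off-center frequency more strongly, which matches the statement that a greater penalty factor gives faster attenuation on both sides of the center frequency.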
2.2. Traditional Methods to Determine K
2.2.1. Maximum Center Frequency Observation Method
The principle of the maximum center frequency observation (MCFO) method is to observe the trend of the maximum center frequency as K increases. The maximum center frequency among the mode components increases gradually with the mode number. When the maximum center frequency tends to be stable, the value of K can be determined.
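The stabilization check can be sketched in Python as follows. MCFO is described as an observation-based method, so the relative tolerance `rel_tol`, the function name, and the example values are assumptions introduced only to make "tends to be stable" concrete; they are not taken from [6].

```python
def mcfo_select_k(max_center_freqs, rel_tol=0.01):
    """Return the first K from which the maximum center frequency stays stable.

    max_center_freqs[i] is the largest center frequency obtained with K = i + 1.
    rel_tol is a hypothetical stability tolerance (relative deviation).
    Returns None when no stable tail of at least two values exists.
    """
    n = len(max_center_freqs)
    for i in range(n - 1):
        tail = max_center_freqs[i:]
        ref = tail[0]
        if ref > 0 and all(abs(f - ref) / ref <= rel_tol for f in tail):
            return i + 1
    return None


# Hypothetical maximum center frequencies for K = 1..8.
trend = [12.6, 150.8, 900.0, 1500.0, 1750.0, 1809.6, 1809.7, 1809.6]
k_mcfo = mcfo_select_k(trend)
```

The result depends directly on the chosen tolerance, which reflects the lack of a quantitative criterion in observation-based methods.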
2.2.2. Correlation Coefficient Method
The correlation coefficient (CC) between each mode component and the original signal can be obtained by (5). Decomposition is stopped when the minimum correlation coefficient is less than the given threshold, and the value of K is then determined:
$$\rho_k = \frac{E\left[ (x - E[x]) (u_k - E[u_k]) \right]}{\sqrt{E\left[ (x - E[x])^2 \right] E\left[ (u_k - E[u_k])^2 \right]}}, \tag{5}$$
where $\rho_k$ represents the CC between the $k$-th IMF and the original signal; $x$ and $u_k$ represent the original signal and the IMF obtained by VMD; and $E[\cdot]$ denotes the mathematical expectation.
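A minimal Python sketch of this criterion is given below, assuming the standard Pearson form of the correlation coefficient with sample means in place of expectations. The two-component test signal, the stand-in "IMFs", and the threshold value are hypothetical and serve only to show the stopping rule.

```python
import math


def pearson_cc(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def min_cc(signal, imfs):
    """Smallest correlation coefficient between the signal and any IMF."""
    return min(pearson_cc(signal, imf) for imf in imfs)


# Hypothetical example: the exact components stand in for the IMFs.
t = [i / 1000.0 for i in range(1000)]
c1 = [math.cos(2 * math.pi * 2 * ti) for ti in t]
c2 = [0.25 * math.cos(2 * math.pi * 24 * ti) for ti in t]
signal = [a + b for a, b in zip(c1, c2)]

smallest = min_cc(signal, [c1, c2])
threshold = 0.1                      # hypothetical threshold
stop_decomposition = smallest < threshold
```

Here the weaker component still correlates with the signal well above the hypothetical threshold, so the decomposition would continue to the next K.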
2.2.3. Normalized Mutual Information Method
The mutual information (MI) between the IMF and the original signal is calculated by (6). Decomposition is stopped when the minimum value of NMI is less than the given threshold, and the value of K is then determined:
$$I(X, Y) = H(Y) - H(Y \mid X), \tag{6}$$
where $H(Y)$ represents the information entropy of $Y$, and $H(Y \mid X)$ represents the conditional entropy of $Y$ when $X$ is known. The stronger the correlation between $X$ and $Y$, the smaller the conditional entropy $H(Y \mid X)$ and the larger the mutual information $I(X, Y)$. The normalized mutual information (NMI) of each IMF is obtained by normalizing the MI values as expressed in (7), where the index denotes the serial number of the IMF.
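The entropy-based definition of mutual information can be sketched in Python with a simple histogram estimator. The equal-width binning, the bin count `n_bins`, and the function names are assumptions made for illustration; the normalization step that turns MI into NMI is not reproduced, since its exact form is not restated here.

```python
import math
from collections import Counter


def _bin_indices(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # constant input falls into bin 0
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y, n_bins=16):
    """Estimate I(X, Y) = H(Y) - H(Y | X) from binned samples.

    Uses the identity H(Y | X) = H(X, Y) - H(X).
    """
    bx = _bin_indices(x, n_bins)
    by = _bin_indices(y, n_bins)
    h_y_given_x = entropy(list(zip(bx, by))) - entropy(bx)
    return entropy(by) - h_y_given_x


# Hypothetical check: a ramp shares all of its (binned) information with itself.
ramp = [i / 100.0 for i in range(100)]
mi_self = mutual_information(ramp, ramp, n_bins=4)
mi_const = mutual_information(ramp, [1.0] * 100, n_bins=4)
```

A signal compared with itself gives the largest value (the entropy of the binned signal), while a constant sequence, which carries no information about the signal, gives zero.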
2.3. Center Frequency Statistical Analysis
The center frequency statistical analysis (CFSA) method is proposed in this work. Its main idea is to count, in the center frequency histogram, the center frequencies whose counts are higher than the mean value; this number is taken as the K value. The detailed steps are as follows. (1) Initialize the VMD parameters. (2) Decompose the signal by VMD to obtain the IMFs. (3) Calculate the center frequency of each IMF using FFT. (4) Draw the center frequency histogram. (5) Calculate the average count in the center frequency histogram and count the number (N) of center frequencies whose counts are higher than this average. Then increase the K value by 1 and return to step (2). When the N value no longer increases, the decomposition is stopped, and the N value is the best number of IMFs. The flowchart of the CFSA method is shown in Figure 1.
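The histogram-counting step can be sketched in Python as follows. This is one possible reading of the procedure, not a definitive implementation: it assumes that the center frequencies from all trial values of K are pooled into one histogram and that a frequency is counted when its bin occupancy exceeds the mean occupancy of the non-empty bins. The bin width, the function name, and the example center frequencies are hypothetical, and the VMD decomposition itself is taken as given.

```python
from collections import Counter


def cfsa_dominant_frequencies(center_freqs_by_k, bin_width=10.0):
    """Count histogram bins whose occupancy exceeds the mean occupancy.

    center_freqs_by_k: one list of IMF center frequencies per trial value of K.
    bin_width: hypothetical histogram resolution (same unit as the frequencies).
    Returns (N, bin_centers) for the bins counted as dominant.
    """
    pooled = [f for freqs in center_freqs_by_k for f in freqs]
    if not pooled:
        raise ValueError("no center frequencies supplied")
    counts = Counter(round(f / bin_width) for f in pooled)
    mean_count = sum(counts.values()) / len(counts)  # mean over non-empty bins
    dominant = sorted(b * bin_width for b, c in counts.items() if c > mean_count)
    return len(dominant), dominant


# Hypothetical center frequencies for K = 1..6 on a three-component signal:
# the true components recur for every K, spurious ones appear only once.
runs = [
    [12.6],
    [12.6, 150.8],
    [12.6, 150.8, 1809.6],
    [12.6, 150.8, 900.0, 1809.6],
    [12.6, 150.8, 600.0, 1200.0, 1809.6],
    [12.6, 150.8, 400.0, 1000.0, 1500.0, 1809.6],
]
n_dominant, dominant_bins = cfsa_dominant_frequencies(runs)
```

In this toy data the three recurring frequencies collect far more counts than the one-off spurious ones, so three bins exceed the mean count.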

3. Simulation Studies
In this section, four factors that influence the K value are studied by simulation analysis: component amplitude, component frequency, component number, and noise amplitude.
3.1. Simulation Signal
The simulation signal is synthesized from several sinusoidal signals with the following expression:
$$s(t) = \sum_{i=1}^{6} x_i(t) + n(t), \qquad n(t) = A_n \eta(t), \tag{8}$$
where $x_i(t)$ ($i = 1, \ldots, 6$) are six sinusoidal signal components, $A_i$ are the amplitudes of the corresponding components, and $f_i$ are the frequencies of the components. $n(t)$ is the noise signal, $A_n$ is the amplitude of the noise signal, and $\eta(t)$ is Gaussian white noise. Simulation signals under different influencing factors are shown in Table 1.
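A signal of this kind can be generated with the short Python sketch below. The sine form of the components, the sampling frequency, the duration, the seed, and the example amplitudes and frequencies are all assumptions for illustration; they are not the values listed in Table 1.

```python
import math
import random


def simulate_signal(amplitudes, frequencies, noise_amplitude,
                    fs=1000.0, duration=1.0, seed=0):
    """Sum of sinusoids plus scaled Gaussian white noise.

    amplitudes, frequencies: per-component values (frequencies in Hz).
    noise_amplitude: scale applied to unit-variance Gaussian noise.
    fs, duration, seed: hypothetical sampling settings for reproducibility.
    """
    rng = random.Random(seed)
    n = int(fs * duration)
    signal = []
    for i in range(n):
        t = i / fs
        clean = sum(a * math.sin(2.0 * math.pi * f * t)
                    for a, f in zip(amplitudes, frequencies))
        signal.append(clean + noise_amplitude * rng.gauss(0.0, 1.0))
    return signal


# Hypothetical three-component signal, noise-free and with noise.
clean = simulate_signal([1.0, 0.25, 0.0625], [2.0, 24.0, 288.0], 0.0)
noisy = simulate_signal([1.0, 0.25, 0.0625], [2.0, 24.0, 288.0], 0.1)
```

Setting an amplitude to zero removes the corresponding component, so the same function covers signals with different component numbers.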
3.2. Simulation Results Analysis
3.2.1. Influence of Component Amplitude
Signals S1 to S4 were analyzed in this section using the four methods described above. VMD parameters except K are initialized as follows: .
According to the MCFO method, the maximum center frequency trend under different component amplitudes is shown in Figure 2. As can be seen, the maximum center frequency gradually stabilizes once the value of K exceeds 6, so six IMFs are obtained by MCFO. However, the simulation signals contain only three components, so the decomposition result is inconsistent with the actual component number. The method is therefore not accurate in determining the IMF number of VMD.

According to the CC method, the correlation coefficients under different component amplitudes are shown in Table 2. For a more intuitive analysis of the results, the threshold ranges are plotted in Figure 3. When the component amplitudes differ, the threshold range of the correlation coefficient also differs, but all the threshold ranges share a common intersection, (0.0649, 0.0823). With any threshold value taken from this intersection, three IMFs are obtained, which is consistent with the component number of the original signal.

According to the NMI method, the normalized mutual information under different component amplitudes is shown in Table 3. There is no threshold range for signal S4, because the minimum NMI value at K = 4 (0.0658) is larger than the minimum NMI value at K = 3 (0.0579). For a clearer view of the results, the threshold ranges are plotted in Figure 4. Since signal S4 has no threshold range, there is no threshold intersection from which to determine the IMF number, so the NMI method is invalid for this signal.
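The threshold-range reasoning used for the CC and NMI methods can be made concrete with a small Python sketch. It assumes that a signal's threshold range is the open interval between the minimum indicator value at K + 1 and the minimum value at the true K, which is empty when the former is not smaller; this reading, the function names, and the example ranges in the intersection are assumptions for illustration. Only the two NMI values for signal S4 are taken from the text.

```python
def threshold_range(min_at_true_k, min_at_next_k):
    """Interval of thresholds that separates K from K + 1, or None if empty."""
    if min_at_next_k < min_at_true_k:
        return (min_at_next_k, min_at_true_k)
    return None


def intersect_ranges(ranges):
    """Common interval of all threshold ranges, or None if there is none."""
    if not ranges or any(r is None for r in ranges):
        return None
    lower = max(r[0] for r in ranges)
    upper = min(r[1] for r in ranges)
    return (lower, upper) if lower < upper else None


# Signal S4 with the NMI method: 0.0579 at K = 3 and 0.0658 at K = 4.
s4_range = threshold_range(min_at_true_k=0.0579, min_at_next_k=0.0658)

# Hypothetical ranges for three signals that do share a common interval.
common = intersect_ranges([(0.05, 0.09), (0.06, 0.10), (0.04, 0.08)])

# A single signal without a range removes the intersection altogether.
no_common = intersect_ranges([(0.05, 0.09), s4_range])
```

Under this reading, a single signal without a valid range is enough to leave the whole group without a usable threshold.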

According to the CFSA method, the center frequency histograms under different component amplitudes are shown in Figure 5. The dominant center frequencies are about 12.6 Hz, 150.8 Hz, and 1809.6 Hz. With three dominant frequencies, 3 IMFs are determined, which is consistent with the component number of the original signal. The results show that the proposed method is effective for this kind of signal.

3.2.2. Influence of Component Frequency
Signals S5 to S8 were analyzed using four methods in this section. VMD parameters except K are initialized as follows.
According to the MCFO method, the maximum center frequency trend under different component frequencies is shown in Figure 6. The maximum center frequencies of the four signals differ when the value of K is from 1 to 4. Once the value of K exceeds 6, the center frequencies of the four signals tend to be stable and no longer increase significantly. The IMF number is therefore 6; however, this result is inconsistent with the component number of the original signal.

According to the CC method, the correlation coefficients under different component frequencies are shown in Table 4. The threshold ranges are plotted in Figure 7. As can be seen, the intersection of the threshold ranges is (0.0591, 0.1007). With any threshold value taken from this intersection, the IMF number is determined to be 3. This result is consistent with the actual component number of the simulated signal.

According to the NMI method, the normalized mutual information under different component frequencies is shown in Table 5. There is no threshold range for signal S7, because the minimum NMI value at K = 4 (0.0614) is larger than the minimum NMI value at K = 3 (0.0599). It can be seen from Figure 8 that the threshold ranges have no intersection, so the K value cannot be determined by this method.

According to the CFSA method, the center frequency histograms under different component frequencies are shown in Figure 9. Processing signal S5 gives three dominant center frequencies of approximately 12.6 Hz, 75.4 Hz, and 452.3 Hz. Signal S6 gives 12.6 Hz, 100.5 Hz, and 804.3 Hz; signal S7 gives 12.6 Hz, 125.7 Hz, and 1255.3 Hz; and signal S8 gives 12.6 Hz, 151 Hz, and 1809.7 Hz. In summary, the CFSA method yields an IMF number of three in every case, which is consistent with the actual component number of the simulation signal.

3.2.3. Influence of Component Number
Signals S9 to S12 were analyzed by four methods in this section. VMD parameters except K are initialized as follows.
According to the MCFO method, the maximum center frequency trend under different component numbers is shown in Figure 10. When the simulation signal contains 3, 4, or 5 components, the maximum center frequency trends are almost the same. When the simulation signal contains 6 components, however, the trend differs from the other cases. In all cases the maximum center frequency tends to be stable once the value of K exceeds 7, so the IMF number is 7. This result is inconsistent with the actual component number of the simulation signal.

According to the CC method, the correlation coefficients under different components number are shown in Table 6. The threshold ranges are plotted in Figure 11. As can be seen, there is no threshold intersection for determining the IMF number. Therefore, when VMD is used to decompose the signals S9 to S12, it is difficult for CC method to determine the IMF number.

According to the NMI method, the normalized mutual information under different component numbers is shown in Table 7. There is no threshold range for signal S9, because the minimum value of NMI is 0.0502, which is larger than 0.0436. The threshold ranges are plotted in Figure 12. As can be seen, there is no threshold intersection, so the NMI method cannot determine the IMF number when signals S9 to S12 are processed by VMD.


According to CFSA, the center frequency histograms under different component numbers are shown in Figure 13. The histogram obtained from signal S9 has three dominant center frequencies (12.6 Hz, 151.1 Hz, and 1808.5 Hz). Signal S10 gives four dominant center frequencies (12.6 Hz, 150.9 Hz, 627.9 Hz, and 1809.8 Hz). Signal S11 gives five (12.6 Hz, 150.9 Hz, 627.5 Hz, 1129 Hz, and 1808.7 Hz), and signal S12 gives six (12.6 Hz, 150.8 Hz, 628.3 Hz, 1131 Hz, 1508 Hz, and 1809.6 Hz). The simulation signals S9 to S12 contain exactly 3, 4, 5, and 6 components, respectively, so the CFSA results match the actual component numbers.
3.2.4. Influence of Noise Amplitude
Signals S13 to S16 were analyzed by four methods in this section. VMD parameters except K are initialized as follows.
According to the MCFO method, the maximum center frequency trend under different noise amplitudes is shown in Figure 14. When there is no noise, the maximum center frequency tends to be stable once the value of K exceeds 2. When the noise amplitude is 0.05 or 0.1, the maximum center frequency tends to be stable from K = 6. When the noise amplitude is 0.15, it tends to be stable from K = 7. The IMF number determined by the MCFO method therefore changes with the noise amplitude.

According to the CC method, the correlation coefficients under different noise amplitudes are shown in Table 8. The threshold ranges are shown in Figure 15. The four threshold ranges have no intersection, so the value of K cannot be determined. The CC method is therefore not suitable for determining the IMF number when VMD is applied to signals S13 to S16.

According to the NMI method, the normalized mutual information under different noise amplitudes is shown in Table 9. Four threshold ranges are shown in Figure 16. There is no intersection of the four threshold ranges, so there is no mutual information threshold to determine the K. Therefore, the NMI method is ineffective in this case.

According to CFSA, center frequency histogram under different noise amplitudes is shown in Figure 17. As can be seen, there are three dominant center frequencies in results of signals S13 to S16, which corresponds to the actual components number of signals.

In summary, the MCFO and NMI methods cannot accurately determine the IMF number when any of the four influencing factors changes. The CC method determines the IMF number accurately only under changes in the first two factors (component amplitude and component frequency). The proposed CFSA method obtains the IMF number accurately under variation of all four factors, which verifies the advantages and effectiveness of the proposed method.
4. Experimental Validation
To further verify the superiority of the proposed method, the four methods were tested on rolling bearing experimental data from Case Western Reserve University [20]. The experimental data parameters are as follows: the load is 3 horsepower and the sampling frequency is 12 kHz. A total of 12000 data points were used for analysis. The time domain signals and FFT spectra of the normal (Normal_3), inner race fault (IR028_3), outer race fault (OR021@6_3), and ball fault (B028_3) states are shown in Figure 18.

In the MCFO method, VMD parameters except K are initialized as . The maximum center frequency trends are shown in Figure 19. As can be seen from Figure 19(a), the maximum center frequency fluctuates twice, at K = 4 and K = 9, so an accurate K value cannot be obtained. As can be seen from Figure 19(b), the maximum center frequency tends to be stable from K = 5, so the number of IMFs is 5. Similarly, Figures 19(c) and 19(d) give K = 3 for the outer race fault and ball fault states.

In the CC method, VMD parameters except K are initialized as . The correlation coefficients are shown in Table 10. As can be seen, there is no obvious threshold such that, from a certain K onward, a fixed number of correlation coefficients always exceed it. An accurate K value therefore cannot be obtained from the correlation coefficients of the four state signals.
In the NMI method, VMD parameters except K are initialized as . The NMI results of the four state signals are shown in Table 11. As can be seen, the K value cannot be determined from the normalized mutual information values, because no threshold can be found such that, from a certain K onward, a fixed number of NMI values always exceed it.
In the CFSA method, VMD parameters except K are initialized as . The center frequency histograms obtained by CFSA for the four bearing states are shown in Figure 20. As can be seen from Figure 20(a), the IMF number is 5 because there are 5 center frequencies whose counts are higher than the average. Similarly, the IMF number in the other three states is 4.

To evaluate the accuracy of the methods quantitatively, the error defined in formula (9) is used. It is computed from the main frequencies of the FFT spectrum, the center frequencies obtained by the four methods, the corresponding amplitudes, and the number of center frequencies.
According to the K values determined by the four methods, the bearing vibration signals in the four states are decomposed by VMD, and the center frequencies of the IMFs in each state are obtained. The error of each method is calculated by formula (9), and the results are shown in Table 12.
As Table 12 shows, no errors are listed for the CC and NMI methods because these two methods do not yield a K value. In the Normal_3 state, the CFSA method gives K = 5, while the MCFO method cannot obtain a K value. In the IR028_3 state, the CFSA method gives K = 4, and the resulting center frequencies show no difference from the main frequencies of the FFT spectrum, whereas the MCFO method gives K = 5 with an error of 2.2%. In the outer race fault and ball fault states, the CFSA method gives K = 4, and the center frequencies are completely consistent with the main components of the FFT spectrum. The MCFO method gives K = 3 in both states, with errors of 1.3% and 0.7% for the outer race fault and ball fault signals, respectively. To sum up, the CFSA method obtains an appropriate IMF number when VMD is used to process the bearing vibration signals.
5. Conclusion
This study proposes a novel method, based on the center frequencies of the IMFs, to determine the IMF number of VMD. Compared with the MCFO, CC, and NMI methods, the proposed CFSA method accurately obtains the K value from complex simulation signals with variable components. The proposed method was also shown to be effective for processing rolling bearing vibration signals: the center frequencies obtained from the vibration signals of bearings in normal and fault states are consistent with the main components of the FFT spectrum. The results show that the proposed method has high robustness and accuracy. In future work, the effectiveness of the proposed method needs to be further verified on other types of vibration signals, for example, those of planetary gearboxes or wind turbine gearboxes.
Data Availability
The data used to support the findings of this study can be found at http://csegroups.case.edu/bearingdatacenter/pages/download-data-file.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was funded by the National Natural Science Foundation of China (grant nos. 51875576, 51875575).