Abstract
The idea of diesel engine health evaluation is put forward, and the calculation flow, fast algorithm, and influencing factors of wavelet fractal dimension are analyzed. Through the real vehicle experiment, the corresponding relationship between the fractal dimension of the wavelet and the health condition of the diesel engine is calculated and analyzed. Three quantitative expression parameters, age health degree, dimension health degree, and comprehensive health degree, are defined considering the age and working condition of diesel engine. Finally, the consistency of health degree with vibration intensity, acceleration time, deceleration time, and fuel consumption was confirmed. It has been proved that it is feasible to evaluate the health of diesel engine with health degree.
1. Introduction
Evaluating the health state of dynamic system based on the vibration signal has the advantages of high speed, high precision, wide range, easy signal measurement, etc. It can realize online monitoring and is especially suitable for state evaluation of moving parts such as diesel engines and gearboxes. However, vibration analysis also has certain difficulties because of the following reasons: The number of vibration excitation sources is large, the frequency distribution of vibration is wide, and the mutual interference cannot be avoided. The number of internal moving parts is large, the structure is complex and closely matched, and it is difficult to access under normal working conditions. The number of vibration transmission paths is large and the transmission characteristics are complex.
On one hand, the peak, mean, root mean square, variance, standard deviation, cubic moment, fourth moment, waveform factor, pulse factor, and margin factor of the time domain characteristic parameters of vibration can be directly used for online evaluation because the calculation is relatively simple and fast. On the other hand, with FFT algorithm, autocorrelation, coherence, partial coherence, cepstrum, third-order spectrum, correlation spectrum, refinement spectrum, and cross-correlation, cross-spectrum, coherence analysis, transfer function, and other spectrum methods provide very useful feature information for condition monitoring and fault diagnosis. Combining fault FFT spectrum analysis with time domain analysis in the fault diagnosis makes it easier to highlight the fault characteristics.
In 1984, Randall conducted spectral trend analysis for condition monitoring of mechanical equipment [1]. In the same year, Bosmans used spectrum analysis to carry out detection and early diagnosis of potential failures of rotating machinery [2]. In 1989, Scheibel proposed histogram analysis technology: the rotor vibration time domain waveform in normal operation state and the rotor vibration time domain waveform in the fault state are subtracted, and then the FFT spectrum is obtained to reflect the rotor vibration information caused by the fault [3]. In 1989, He of Xi’an Jiaotong University proposed several methods for reprocessing power spectrum improving the ability of analysis and recognition of power spectrum [4]. In 2005, Han proposed a full information cepstrum analysis method based on homogenous data fusion of rotating machinery [5]. The concept of full information analysis, which is successfully extended to the analysis field other than frequency domain analysis, has achieved obvious results when applied to the fault diagnosis of high-speed rotating power gear.
The idea of wavelet analysis comes from the scaling and translation methods. Wavelet transform is known as “mathematical microscope,” having the functions such as zooming in, zooming out, and translation and studies the dynamic characteristics of signals by examining changes at different magnifications. As a time-frequency analysis theory, wavelet analysis is considered to be a new stage in the development of Fourier analysis. Its good analytical characteristics make wavelet transform a powerful new tool for signal processing. Wavelet transform has good localization characteristics in both time domain and frequency domain. It fully embodies the idea of adaptive resolution analysis that the time window of high-frequency information is narrowed, and the time window of low frequency information is widened. As wavelet transform deals with frequency domain information in terms of frequency bands rather than frequency points, the processing will not be greatly impacted when the signal frequency is slightly fluctuating and contains nonintegral harmonics. Wavelet packet technology realizes arbitrary precision division in the frequency domain, which undoubtedly provides a powerful processing tool for harmonic analysis. From the temporal locality of the wavelet transform, it is known that when the local fluctuation of the signal occurs, it does not spread the influence to the entire frequency band like the Fourier transform, but changes the spectrum analysis at that time for a short period of time.
Wavelet analysis has been successfully applied in lots of fields such as quantum field theory, seismic exploration, computer vision, speech recognition and synthesis, radar, CT imaging, fluid turbulence, mechanical fault diagnosis and monitoring, and fractal and digital television. In 1993, Wu from Huazhong University of Science and Technology discussed the feature extraction and recognition method based on small slope transform, the feature extraction of the maximum point of the fundamental wavelet transform modulus, and the feature refinement method based on the evolution of the magnitude of the modulus maximum point along the scale S and constructed the global similarity measure, from the application point of feature signal pattern recognition in the field of equipment diagnosis [6]. In 1997, Chen et al. of Xi'an Jiaotong University proposed an understanding of wavelet analysis from the perspective of engineering application and analyzed the mechanical failure of rolling bearings and gearboxes with the wavelet packet algorithm [7]. In the same year, Zhao proposed generalized adaptive wavelets based on the adaptive wavelets analysis theory and applied adaptive wavelet analysis of vibration signals realized by genetic algorithms to diagnose the internal-combustion engine [8]. In 1998, Xiang Yang reconstructed and analyzed the engine cylinder pressure signal after using the filtering performance of wavelet transform to remove the channel effect, and Wang discussed the method of diagnosing and identifying reciprocating mechanical faults by combining wavelet multiresolution analysis with information entropy from the application of characteristic signal processing in the field of reciprocating mechanical fault diagnosis [9, 10]. In 1999, Wang extracted fault information in the vibration signal of rolling bearing under strong background noise and diagnosed the early failure of rolling bearing by combining envelope analysis with wavelet transform [11]. In the same year, Xu used wavelet analysis technology to diagnose the diesel engine bearings through the vibration signal of the diesel engine. In 2000, Liu gave an improved algorithm for wavelet packet transform overcoming the frequency aliasing phenomenon in the original algorithm by frequency shift processing, aligning the order of the decomposition sequence with the linear division order of the frequency band, and realizing the rapid extraction of the whole cycle diagnosis feature vector through the wavelet packet decomposition of the cylinder head vibration signal in the complete working cycle [12]. In 2006, Tang proposed a vibration signal preprocessing method based on the second generation wavelet transform, which is to select the best predictor for each sample by analyzing the local characteristics of the signal with the minimum prediction variance as the target, making the wavelet basis function always match the local features of the signal [13]. In recent years, more and more scholars have used wavelet theory as a basic tool for vibration signal analysis, and there has been a trend to combine wavelet theory with other analysis methods such as fuzzy theory and neural networks.
Fractal is an active branch of nonlinear science. The initial form of fractalism is fractal geometry, of which the main idea is self-similarity between local and global, the research object is the nonsmooth and nondifferentiable geometry generated in the nonlinear system, and the corresponding quantitative parameter is the dimension. The birth of fractalism provides people with a new way of understanding complex things, which is an epistemology from the whole to the local and from macro to micro. Fractal dimension, which is the most basic parameter to describe fractal, is different with the traditional integer dimension. It will show singularity and complexity according to traditional experience when investigating a fractal system, while it will show positive and ordinary results when the fractal system is understood with the concept of fractals. Fractal dimension, an important parameter to quantitatively describe the degree of “singularity” of chaotic attractors, is widely used in the quantitative description of nonlinear system behavior, measures the ability of the system to fill the space, and describes the disorder of the system from the aspects of measure theory and symmetry theory [14–16].
In 1973, based on the research of Cantor, Peano, Hausdorff, and others, Mandelbrot put forward the idea of fractal geometry, breaking through the notion that dimension can only be an integer and considering the complex object to be a fractal dimension. It is a new trend that the fractal theory is applied to the field of mechanical system condition monitoring and fault diagnosis in academic circles in recent decades. In 1997, Li applied the fractal theory to the fault diagnosis of automobile engines and achieved good results in his doctoral thesis [17]. In 1998, Lü of Beijing University of Science and Technology studied fractal dimension and the application in fault diagnosis of rolling bearings, using fractal dimension as the feature quantity to identify rolling bearing faults. In 2000, Wang of Shanghai Jiaotong University proposed the technical route and basic calculation method of using fractal theory for fault diagnosis in the research of the correlation dimension’s application in the fault diagnosis of large-scale units [18]. In 2001, Xia, introducing the fractal theory into the vibration diagnosis of internal-combustion engine, diagnosed and classified the fault after studying the data of the corresponding combustion section in the cylinder head vibration signal according to the timing of the internal-combustion engine and using the calculated correlation dimension to describe the nonlinear behavior of the cylinder head of the internal-combustion engine in different states of the valve [19]. In the same year, Shi discussed the wavelet fractal method for mechanical fault diagnosis systematically. In 2002, Teng further studied the effect of noise on the results of correlation dimension calculations [20]. In 2003, Hou studied the correlation dimension of generator sets under different working conditions and gave the criteria for fault diagnosis of steam turbine generator sets with fractal theory, which provided online monitoring and fault diagnosis a reliable tool for mechanical equipment [21]. In 2004, Huang proposed the multifractal generalized dimension to describe the signal characteristics for the characteristics of rotating machinery faults and proposed a fault diagnosis method for rotating machinery with fault correlation coefficient based on the vibration signal pattern space sample database [22]. In 2006, Xu proposed a new multifractal feature, parameter-sensitive dimension in “Fractal Characteristics and Fault Diagnosis Method of Mechanical System Dynamics,” which was applied to the fault diagnosis of steam turbine generator set [23].
The birth of fractal theory provides a new way for time series analysis. By studying the fractal behavior of time series, we can analyze the characteristics and laws of time series from a new perspective. In addition, the combination of wavelet theory and fractal theory becomes possible since wavelet and fractal have the same meaning from the essence of the idea and the implementation method. In fractal theory, fractal dimension is an important characteristic parameter of nonlinear dynamical system, which quantitatively describes the complexity of the system. Among many fractal dimensions, the correlation dimension is sensitive to the inhomogeneous response of the attractor and reflects the dynamic structure of the attractor at a deeper level [24]. The more uncertain the components of the system, the larger the correlation dimension; the more deterministic the components of the system, the smaller the correlation dimension. Therefore, the correlation dimension is selected as the characteristic parameter of the diesel health evaluation [25]. This paper focuses on the fast algorithm of correlation dimension and feature extraction based on wavelet fractal theory, in which phase-space reconstruction and calculation of correlation summation are the key technologies. Phase-space reconstruction is to recover the state information of complex system contained in the time series [26]. The time delay method is used here. The selection problem of the reconstruction parameters τ and m is mainly discussed. The distance measurement method is selected as the better calculation of correlation summation and the nearest neighbor search technology of point pairs, which is used to improve the G-P algorithm, reduces the computational time greatly [27–29]. The efficiency of the fast algorithm of correlation dimension is proved by an example, and the factors such as signal interception and noise affecting the correlation dimension are analyzed. The correlation dimension is calculated in a specific frequency segment more accurately and quantitatively after the vibration signal is decomposed by wavelet packet according to the characteristics of wavelet theory and fractal theory. Finally, the concept of dimension health degree, age health degree, and comprehensive degree is proposed, and it is verified that using the wavelet correlation dimension to evaluate the health of diesel engines is feasible.
2. Meaning of the Health Condition Evaluation of Diesel Engine
The health evaluation of diesel engine comes from the condition monitoring and fault diagnosis field of diesel engine. Condition monitoring and fault diagnosis technology includes grasping and understanding the condition of equipment in use, confirming its overall or local normal or abnormal, finding fault and ascertaining the location, reason, nature and degree of fault, etc. [30, 31]. The early maintenance system of diesel engine is basically the Breakdown Maintenance that the diesel engine will not be repaired until something wrong has occurred, which often causes huge economic losses. Subsequently, the periodic maintenance system was gradually implemented, and the contradiction between “Insufficient Maintenance” and “Excessive Maintenance” caused by this system became increasingly prominent. In order to solve this contradiction, the concept of CBM (Condition-Based Maintenance), which is a new maintenance method of diesel engine, has been widely accepted. By real-time monitoring diesel engine’s working condition and working environment, it analyzes diesel engine working condition, forecasts the effective working life and diagnoses possible faults of diesel engine, and thus formulates reasonable maintenance plans for diesel engines to improve the reliability, safety, and effectiveness of diesel engine operation [32]. Under this trend, the concept of health evaluation of diesel engine was put forward.
The evaluation of the health condition of diesel engine is a technology to quantitatively evaluate whether the function of diesel engine components is normal, whether the performance is stable, and confirm the state of operation, the dynamic performance, and the efficiency of the diesel engine. Its characteristics are mainly reflected in the following aspects:(1)Health evaluation does not analyze the fault with obvious characteristics, but mainly evaluates the performance of diesel engine. For example, in cylinder pulling, there is no need for complex testing, analysis, and diagnosis to confirm as it is easy to make accurate judgment from the sound. Previously, people classified the state of diesel engines into normal state, fault state 1, fault state 2, and so on. In fact, the health state of the diesel engine is not the same, and there are even great differences when there are no faults, as different health conditions contain much abundant information, which can not only be used to evaluate the current working ability of the diesel engine, but also to predict the future situation.(2)Health condition evaluation is different from technical status estimate and the same technical status does not mean the same health condition such as 50-year olds with normal body organs are different from 20-year olds, because they are in different age groups, and the remaining life span and working time are different.(3)Health condition is different from motor hours recorded in the tally book. Motor hours are like the years of life of people and 20-year olds who suffer from cancer are far worse off than 50-year olds who are in good health.(4)Quantitative evaluation: through the correlation dimension, motor hours, and other characteristic parameters, the quantitative evaluation index of the health condition of diesel engine is established, which is more accurate than the qualitative technical condition evaluation.
3. Calculation of the Fractal Dimension
Fractal dimension, which is widely used in the quantitative description of nonlinear system behavior, is an important parameter for quantitatively describing the singularity of chaotic attractors. In the field of mechanical equipment fault diagnosis, people achieved some results by analyzing the measured signals with fractal dimension. The research data show that the fractal dimension can be used as a feature to describe the state of mechanical components and systems such as bearings, gears, engines, etc. Among the numerous fractal dimensions, the correlation dimension is sensitive to the inhomogeneity of attractors and can better reflect the dynamic structure of attractors. The more uncertain the components of the system there are, the greater the correlation dimension is. The more deterministic components of the system there are, the smaller the correlation dimension is.
The probability that the point set Ω falls into the N(r) cube whose side length is r is Pk, and the correlation dimension of the point set Ω is
This definition reflects the degree of correlation with each other.
Note that the total number of Ω midpoints is M and the number of point pairs of which the distance is less than r in Ω is m (r, M), and the correlation function is defined as
If the number of points in Ω falling into the kth hypercube is nk (r, M), there is a probability
The nk points in the kth hypercube constitute the nk2 – nk pair, if the point pairs that are less than r from each other in the adjacent cubes are not counted, then
According to equation (1), the correlation dimension is
This shows that the correlation dimension reflects the degree of correlation of the points in Ω.
3.1. Phase-Space Reconstruction Techniques
The space formed by independent variables in the system is called the phase-space. The dynamic characteristics of the system can be understood through the analysis of the phase-space of the system, but it is often the time series of univariate with equal time interval in the actual problem. The traditional method, which is to analyze the time directly from this sequence, has great limitations. Because the time series contains traces of all the variables involved in the movement, it reflects the interaction of many physical factors comprehensively. Moreover, actually the sequence contains information about chaotic motion, of which the dimensionality of dynamic system is at least three, though it seems to be random from a formal point of view. Therefore, that the chaotic information of the time series extended the time series into the phase-space of three or even more dimensions is fully revealed is the phase-space reconstruction of the time series.
The time delay reconstruction method is used currently widely. In 1980, Packard proposed reconstructing an “equivalent” phase-space from a one-dimensional observable to reproduce the dynamic characteristics of the system while Takens laid a solid foundation in mathematics. The basic point of view is that the change of a variable of the dynamic system, of which the changes over time imply the dynamics of the entire system, is naturally related to the interaction of this variable with other variables of the system although the phase-space is formed with only a variable at different times. Therefore, the trajectory of the reconstructed phase-space also reflects the evolution of the state of the system. The principle is as follows.
A set of phase-space vectors is obtained after the m-dimensional phase-space of the given time series is reconstructed.where is the time delay, ; d is the number of system arguments, M is less than N, and they are on the same order of magnitude.
Phase-space reconstruction, of which the focus is the choice of reconstruction parameters, embedding dimension m and time delay , is the basis of calculating fractal dimension.
3.1.1. Determination of Time Delay
The mutual information method, an effective method for estimating the time delay of reconstructed phase-space, has a wide range of applications in phase-space reconstruction [33]. Shaw first proposed the lag time when the mutual information reaches the minimum for the first time as the time delay of phase-space reconstruction and Fraser gave the recursive algorithm for mutual information calculation.
Let the time delay of time series s1, s2, …, sk, be (T = ), the embedding dimension is m, reconstructing phase-space
Then the average information amount of the system to the variable s is the entropy H of the system.
Considering a total coupled system (S, Q) and assuming that s is known as , then the uncertainty of q is
is the conditional probability.
Let q be known at time t, then the average uncertainty of x at time t + T is
In the above equation,where is the uncertainty of isolated q, while is the uncertainty of q of the known s. Therefore, the known s reduces the uncertainty of q. So the mutual information is
Mutual information is not a function of the variable s or q, but a function of the joint probability distribution , which is the overall measure of the distribution . If Q is the delayed phase of S, then the graph of the delayed phase gives an estimate of the distribution .
The basic difficulty in computing mutual information is to estimate the distribution from the histogram. Usually, the following method is used: set a box of size on the s-q plane; then there iswhere are the number of points in the box and is the total number of points.
For the general case, mutual information is
If the vector is a delay time reconstruction, the lag time when reaches the minimum at the first time can be used as the time delay for phase-space reconstruction. At the same time, if n is sufficiently large (eg greater than the number of freedom degrees of the system), should be a monotonically decreasing function and gives a good estimate of the system’s entropy.
3.1.2. Determination of the Embedding Dimension m
Assume that the time series is , and the reconstructed delay vector is [34]where m is the embedding dimension and is the time delay. indicates the ith reconstruction vector when the embedding dimension is m. Definition aswhere means the maximum norm.
is the ith reconstruction vector when the embedding dimension is m + 1.
is an integer, such that is the nearest neighbor of in the m-dimensional reconstruction space, and is dependent on i and m:(a) in the numerator and in the denominator of the equation (16) are the same.(b)If , the next nearest neighbor is used instead.
The two points in the reconstructed space with the embedding dimension m + 1 are similar, if m is an embedding dimension that conforms to the embedding theory and any two points in the reconstructed space with the embedding dimension m are similar. Such point pairs are called true neighbors; otherwise, they are false neighbors.
The perfect embedding means that there are no false neighbors, which is the idea of false neighbors, judging false neighbors by comparing whether they are greater than a given threshold. is defined as
One problem of this method is the choice of threshold. It can be seen from the definitions of the above equation and in equation (16) that the threshold should be determined by the original signal. Theoretically, thresholds of corresponding to different spatial points i are different, and different time series also have different thresholds. This suggests that it is difficult or even impossible to give an accurate and reasonable threshold because the threshold is independent of each orbit point i, dimension m, and time series data.
To avoid the problem above, define and the average of all : only depends on the dimension m and the delay . To study the change of from m to m + 1, define
It can be found that for a time series, m0 + 1 is the minimum embedding dimension sought when does not change any more with the embedding dimension m gradually increasing to a certain value m0.
Regardless of the methods, the parameter is a necessary parameter for phase-space reconstruction and must be determined prior to the minimum embedding dimension m. The embedding dimension m is independent of the time delay in theory while the minimum embedding dimension m depends on the time delay in reality. Different time delays will result in different minimum embedding dimensions, especially in the case of continuous time series. Obviously, choosing a suitable time delay can reduce the minimum embedding dimension m in phase-space reconstruction. This is also one reason to choose a suitable time delay to make the reconstructed vectors as independent as possible.
Next, define another quantity to distinguish between deterministic signals and random signals:where the meaning of is the same as above and is the nearest neighbor of in the m-dimensional reconstruction space. Define
In theory, E1 (m) of a random time series will not saturate as m increases. However, in the actual calculation, if m is large enough, E1 (m) will increase slowly or even stop changing. In fact, since the available sampled data is finite, even if the time series is random, E1 (m) will stop changing after m reaches a certain value. To solve this problem, E2 (m) is adopted. For the random sequence, E2 (m) is equal to 1 regardless of the value of m, since the future value and the past value are independent of each other while E2 (m) is related to m for the deterministic sequence. So, E2 (m) is not a constant for different m, and there must be some m let .
3.2. G-P Algorithm for Correlation Dimension
The G-P algorithm for correlation dimension is proposed by Grassberger and Procaccia. , a time series obtained by observing a system, is phase-space reconstructed and embedded in m-dimensional Euclidean space. We can obtain a set of points (or vectors) whose elements are recorded as
For example: N = 10, m = 3, = 2, thenwhere , a fixed time interval, is called the time delay. is the time interval between two adjacent samples. k is an integer.
Arbitrarily select a reference point Xi from these Nm points and calculate the distance from the remaining Nm − 1 points to Xi.
Repeat this process for all Xi (i = 1, 2, …, Nm), the correlation integral function is obtained:where H is the Heaviside function:
Let be the maximum stretch distance of the attractor in the m-dimensional space.
When ,
When , .
It can be seen that the correlation integral reflects the distribution probability of the distance between points in the attractor, so there should be is an constant related to m and r.
For small distances r1 and r2,
Taking the logarithm of both sides of the above formula,
When is very small, . Therefore, according to the above formula,where is the slope of the curve . When ,
When the value of does not change with the increase of the phase-space dimension m,
It is the correlation dimension of the time series.
3.3. Calculation of Correlation Summation
It can be seen from the G-P algorithm that the calculation process of the correlation dimension is not complicated, but the amount of calculating the distance of a large number of point pairs is large and the calculation time consumption increases exponentially with the increase of data volume. When the time series length is 20480 points, the G-P algorithm takes about 32 minutes to calculate the correlation dimension.
In the calculation of correlation dimension and the choice of parameters in phase-space reconstruction, the most important part and the most time-consuming part of computation is the calculation of correlation, which represents the spatial correlation of points, and of which the core is essentially the probability that the distance between two arbitrary points is less than r. The G-P algorithm improves the computational efficiency in two aspects, distance calculation and point pair search.
3.3.1. Distance Measurement of Point Pair
The distance between point Xi and point Xj is
Correlation integral,
Formula (38) is a recursion formula commonly used when seeking correlation dimension. However, in this way, the computation time will be very large when the length of time series is large, and there will be unexpected problems due to too many intermediate repeats and rounding errors in computer programming.
The calculation of space point spacing in formula (38) is based on the sphere field of Euclidean space Rm.
If the distance formula in the diamond field equivalent to the distance topology in the sphere is used, or the distance formula in the square field is adopted, the distances between point Xi and point Xj in the m dimension space can be represented as
In addition, since , and considering when i = j, the expression of correlation integral can be written as
Then the corresponding formulas (39) and (40) will be changed as
In this way, the calculation of correlation Cm (r) in the adoption of (4) and (5) is easier to implement in computer programming, and the processing speed of its program is also several times faster than that of the (38) algorithm.
3.3.2. Nearest Neighbor Search of Point Pairs
When calculating association, the following two methods are commonly used in point-to-point search.
Method 1. Calculate the distance between all point pairs in the phase-space, and the time complexity is O (N2). It can be seen that this method is time consuming, especially when the amount of data is large.
Method 2. As shown in Figure 1(a), calculate only the distance less than the threshold r, of which the decrease will greatly reduce the amount of data. However, its computational efficiency is still insufficient to meet the needs of online evaluation.
In practical applications, we can learn from the idea of the K nearest neighbor search in computer search technology to do point pair search [35]. K-NN algorithm is a traditional statistical pattern recognition method of which the basic idea is as shown in Figure 1(b); in the D dimension points, where the distance can be Euclidean or diamond or square, the K nearest neighbor computations of reference point q is to find K points closest to the point. In addition, in order to improve the speed of search further, the accuracy of some search can be reduced by allowing small relative error , which is the approximate nearest neighbor search. A large number of experiments show that, for high-dimensional datasets, the search speed will be greatly improved even if minimal approximation retrieval error is allowed.
In the actual point search, the time complexity is O (N) and the time consumption is greatly reduced by choosing the reference point and computing the distance between the reference point and the nearest point.

3.4. Fractal Scaleless Range
The fractal system we studied is not self-similarity in strict sense, but self-similarity in approximate or statistical sense, which only exists in a certain range of scale changes. The range of scale changes, beyond which the self-similarity will not exist, is scale-free interval. Therefore, the determination of scale-free interval is very important. In the numerical analysis of fractal system, once the scale-free interval of fractal system is exceeded, the calculated fractal dimension has no practical significance.
The scale-free interval can be ascertained by the three-fold line method, of which the basic idea is to take different values from small to large [36]. On the plane of , assuming that the distribution of point columns is concentrated near the broken line composed of three lines as shown in Figure 2, the middle section corresponds to the scale-free interval. Therefore, the criterion for determining scale-free intervals is to use three-fold line segments to fit these points to minimize the overall error, that is, to use three-fold line segments for optimal approximation.

The requirement of scale-free interval is the minimum scale and the maximum scale to make the correlation dimension formula valid. Equivalently, the lower and upper endpoints of the middle section of the graph are required to be numbered.
The mathematic model of the problem is to find the ordinal number n1, n2 (1 < n1 < n2 < n) when is minimum in the formula:
Recorded as
Obviously, the condition that the reaches the minimum is that , , reaches the minimum. Thus, the coefficients , in equation (44) can be derived from the system of equations:
So,
Regression analysis shows that the sum of squares of minimum residual error is
In order to find the endpoint of scale-free interval, a recursive formula for determining the optimal segmental point is established:
According to equations (38) and (49) formulas, it can be deduced that
Thus, the approximation error and the optimal approximation error of any two-segment polyline of points are, respectively,
The approximation errors and the optimal approximation errors of any three-segment polyline are, respectively,
The error matrix of the optimal segment is recorded as . When , can be stipulated by formula (49); then and error matrix could be determined:where B is a upper triangular matrix. After calculating B from formula (54), the optimal two-fold line vector and the first segment point ordinal number vector can calculated:here, the expression of is is the serial number of the corresponding segment point of , s = 4, …, n.
Similarly, the optimal trilinear vector and the second piecewise point ordinal vector are, respectively,
In the equation, is the serial number of the corresponding segment point of , s = 5, … ,n.
Firstly, can be calculated by formula (53) and corresponding to formula (59) is the target n2. Replacing l in the formula (55) by the n2, can be calculated and corresponding to formulas (55) and (56) is the target n1. Then, is located in the scale-free interval.
3.5. Calculation Process of Dimension
The correlation dimension is selected as one of the characteristic parameters of diesel engine health evaluation. The calculation flow chart is shown in Figure 3 and the main calculation contents are as follows:(1)Signal preprocessing. Because noise has a great influence on the calculation of correlation dimension, we can use the wavelet method to reduce the noise of vibration signals firstly. Then, we normalize the signals with different magnitudes and process them with zero mean.(2)Phase-space reconstruction. It mainly includes the time delay parameter τ selection and inserting dimension parameter m selection in addition. For the same vibration signal, the same parameters can be selected for phase-space reconstruction [37].(3)Calculation of correlation. In the calculation of correlation dimension, the most important part and the most time-consuming part of computation is the calculation of correlation, which represents the spatial correlation of points. The correlation, which mainly includes point pair search and distance calculation, is essentially the probability that the distance between two arbitrary points is less than r.(4)Calculation of the correlation dimension. It mainly includes drawing the ln C(r)-ln r curve, determining the scaleless range, and calculating the slope to get the correlation dimension.

3.6. An Example of Computing Correlation Dimensions
Taking the cylinder head vibration signal of a tank diesel engine as an example, the correlation dimension is calculated. Figure 4 is the original vibration signal sampled. The sampling frequency is 20 kHz and the sampling time is 1 second. The vibration signal measuring point is in the first cylinder of the left row of diesel engine. The testing speed of 500 r/min is controlled by the driver according to the tachometer, and there will be some errors. The accurate value of the tank diesel engine can be calculated from the signal of the top dead center or from the vibration signal. From Figure 5, the cylinder works for 4.5 weeks. In the period of 4 cycles, the number of points is 17572 and it can be calculated by the formula , then .


Figure 6 is a noise-free vibration signal processed by standardization. The standardization process is to convert signals with different amplitudes and mean values into standard signals with amplitudes of 1 and mean values of 0.

Figure 7 is to determine the time delay by self-mutual information method, finding the first minimum point of the curve. As seen from the figure, the abscissa corresponding to the first minimum point of the curve is 7 so the optimal delay time .

Figure 8 is Cao’s method of calculating the minimum embedding dimension m. Decision method of minimum embedding dimension: When E1 (m) increases slowly or no longer changes as m increases, m0 + 1 is the minimum embedding dimension. In practical calculation, when the slope of curve E1 (m) is less than 0.005, it is considered that E1 (m) increases slowly. The minimum embedding dimension can be calculated from the graph: m0 + 1 = 9 + 1 = 10.

Figure 9 is an double logarithmic curve, and the correlation coefficient between A-B is the largest, which is 0.99936. Therefore, A-B is the scale-free interval, between which the slope of the fitting line is the correlation dimension d2 = 2.316.

The length of the data is 20480, and the total calculation time of correlation dimension is about 28 seconds, among which the selection of reconstruction parameters takes 15 seconds and the calculation time of correlation dimension is 13 seconds. The same data takes 32 minutes calculated by the G-P algorithm. This shows that the efficiency of the optimized correlation dimension calculation method is much higher than that of G-P algorithm. In addition, the calculation time of correlation dimension is closely related to the length of data. The larger the amount of data, the more the time consumed. Therefore, we should use the least amount of data to calculate the correlation dimension as far as possible, but too little amount of data will bring great calculation error. Therefore, the length of vibration signal should be reasonably selected in practical application.
4. Influence Factors of Dimension
4.1. Influence of the Signal Interception Position
Take a tank at 200 motor hours as an example to analyze in detail [38]. The test is carried out in situ at the stable speed of 500 r/min, and the sampling frequency is 20 kHz. The second cylinder in the left row of the diesel engine is selected to research the cylinder head’s vibration signal. The waveform is shown in Figure 10.

It can be seen from Figure 10 that five bursts of the second cylinder in the left row of the diesel engine were measured in one second, and the three complete burst processes in the middle, of which the burst peaks’ positions were the 5230th point, 9596th point, and 13932th point respectively, were selected for analysis. Since the length of each burst is between 4300 and 4400, the intercept length is chosen to be 5000. According to the fast algorithm of correlation dimension, the optimal phase-space reconstruction parameters are calculated that the time delay and the embedding dimension m = 7. Then, calculate the correlation dimension of each interception position of each explosion process separately. The selection of the interception position and the calculation result of the correlation dimension are shown Figure 11.

It can be seen from Figure 11:
The correlation dimension values of diesel engine cylinder head vibration signals vary with the position of interception, and some even differ greatly. Therefore, we should select appropriate position of signal interception when calculating the correlation dimension of vibration signals of reciprocating machinery such as diesel engines.
Under certain conditions, the calculation of correlation dimension is repeatable. The selected interception position should put the burst peak in the middle to improve the repeatability of correlation dimension calculation. In practical applications, the vibration signal can be intercepted by simultaneously collecting the top dead center signal.
Series 2 and series 3 have almost the same correlation dimension at the interception positions of 3, 4, 5, 6, 9, and 10 while the correlation dimension of series 1 differs greatly with series 2 and series 3 at all interception positions. The main reason is that the burst lengths of series 2 and series 3 are 4336 and 4337, respectively, which are almost equal, while the burst length of sequence 1 is 4366. Under certain conditions, the calculation of correlation dimension is repeatable. The selected interception position should put the burst peak in the middle to improve the repeatability of correlation dimension calculation.
4.2. Interception Length of Vibration Signal
The correlation dimension of the vibration signals of different intercept lengths may also differ. When processing and analyzing the data, we take the location of the peak of the explosion as the center and take the actual length of the explosion as the benchmark to reduce and increase the length of the intercept signal and then calculate the correlation dimension, respectively. Figure 12 is plotted according to the above data and method.

As can be seen from Figure 12, when the interception length of vibration signal is greater than or equal to the burst length, the change of correlation dimension tends to be gentle. Therefore, in order to improve the repeatability of correlation dimension calculation, the interception length of the signal should be greater than the length of adjacent peaks and contain only one peak.
4.3. Influence of the Signal Interception Method
The results are repeatable when the correlation dimension of the vibration signal is calculated with the peak value as the center and the actual length of the explosion as the intercepted length. However, in practical application, the vibration signal of diesel engine cylinder head sometimes appears the situation that the burst peak value is difficult to ascertain. In order to solve this problem, the vibration signal can be intercepted with collecting one TDC (Top dead center) signal simultaneously in the sampling process. Taking a 12-cylinder diesel engine as an example, the working sequence of the cylinder is left 1 ⟶ right 6 ⟶ left 5 ⟶ right 2 ⟶ left 3 ⟶ right 4 ⟶ left 6 ⟶ right 1 ⟶ left 2 ⟶ right 5 ⟶ left 4 ⟶ right 3, and each operation cycle has two signals of the TD located in the left 1 and left 6 cylinders. The length of each 2 adjacent TDC signals can be divided into 6 equal parts; then the signal can be intercepted with the positions of each equal part as the centers and the length of the adjacent TDC as the signal length. Figure 13 shows the interception method for the sixth cylinder head’s vibration signal in the left row.

4.4. Effect of Noise
No matter how precise the measuring instrument is, the actual collected data will inevitably be disturbed by system noise and measurement noise [39]. In order to analyze the influence of noise on the correlation dimension calculation, John Argyris studied the common attractors, such as Lorenz, Róssler, Chua, Duffing, 2-torus, 3-totus, Logistic, and Hénon. Taking the Lorenz signal superimposed with white noise as an example, Shi Boqiang analyzed the correlation dimension of noise occupancy rate as 0%, 1%, 10%, and 100%, respectively, in reference [40]. Their research has concluded that the presence of noise will lead to an increase in the correlation dimension D2.
4.5. Effect of Diesel Engine Working Conditions
There is a certain relationship between the correlation dimension and the rotating speed according to the above analysis. Theoretically, the chaotic motion of diesel engine becomes more complex with the increase of speed, and the correlation dimension of vibration signal will also increase correspondingly.
According to the experimental data, the correlation dimension increases gradually with the increase of rotating speed under the condition that the vehicle is in situ. Practically, the correspondence between speed and correlation dimension is affected by other factors, such as the vehicle movement and gearshift location.
5. Feature Extraction Based on Wavelet Fractal Theory
The wavelet packet transform gradually gives the waveform of the vibration signal at different scales from coarse to fine, and its transition principle from low resolution to high resolution is consistent with the fractal process from the general to the local, from macro to microdeepening.
Multiresolution analysis is implemented by the Mallat algorithm, which theoretically represents the fractal generated by the fractal iteration if viewed from the reverse. H. G. Stark simplified the scale recursive iteration formula in the Mallat algorithm toand constructed a lot of fractals.
In addition, the fractal theory holds that the whole thing and its components have self-similarity, including strict self-similarity, approximate self-similarity, and statistical self-similarity. According to the fractal theory, the set F can be generated by a function with a compact set [41].where r is a self-similar affine operator, H is a parameter related to the dimension, and both of them are greater than zero.
It can be known from wavelet theory that wavelet is a function family generated by mother wavelet through stretching and translation.where a is the scaling factor and b is the translation factor.
It can be said that the wavelet and its mother wavelet have self-similarity, as the self-similar affine operator r of the fractal is consistent with the scaling operator a of the wavelet transform by comparing equations (60) and (61). Therefore, it is feasible and inevitable to combine the wavelet and the fractal as they have similarities from the essence of the idea and the implementation method.
Cao and Yanyang, respectively, mentioned the scale fractal dimension in the literature [37, 42]. Here, their ideas are promoted, and a new idea is proposed. Reconstruct the different frequency segments subdivided from the vibration signal by wavelet packet separately and calculate the correlation dimension of each reconstructed frequency segment signal. Then the curve of wavelet fractal spectrum is drawn.
Figure 14 shows a preprocessed diesel engine cylinder head vibration signal, Figure 15 shows the structure of the wavelet packet tree, and Figure 16 shows the reconstruction signal of each note decomposed by the three-layer Daubechies wavelet packet.



Figures 17–19 show the wavelet fractal spectra of several vibration signals of a tank under three different working conditions.



It can be seen from the correlation spectra of the reconstructed signals of each node that the correlation dimension values of the 4th node and the 8th node are almost equal under the same working condition, which means the following:(1)At the [2, 0] [3, 0] nodes of the wavelet packet decomposition, which are the 0∼2.5 kHz and 0∼1.25 kHz, the correlation dimension values of the vibration signals are almost equal while the correlation dimension of other frequency parts looks cluttered.(2)The condition information of the system is mainly concentrated in the low-frequency part of the vibration signal is consistent with the result of the spectrum analysis and also conforms to the vibration mechanism of the diesel engine.(3)The correlation dimension of the 4th node and the 8th node can effectively reflect the technical condition of the diesel engine.
In summary, the correlation dimension can be used as a characteristic parameter for the quantitative evaluation of the technique condition of complex system. In order to evaluate more accurately and quantitatively, the correlation dimension can be calculated in the specific frequency segment after the vibration signal is decomposed by wavelet packet.
6. Dimension Health Degree
6.1. Correspondence between Correlation Dimension and Health Condition
The diesel engine, of which the exciting force mainly comes from the inertia force generated by the piston crank mechanism at high speed and periodic movement and the periodic gas pressure generated by the combustion of gas in the cylinder, is a reciprocating machine. Therefore, the information of the cylinder head vibration signal from the combustion chamber is most abundant, and the correlation dimension of the vibration signal can be calculated as the characteristic parameter of the diesel engine health evaluation. It can be verified by measuring the vibration speed, instantaneous speed of acceleration and deceleration, and fuel flow rate of diesel engine at the same time.
Though the real vehicle test has a higher practical value, it is impossible to complete the full-life test of the tank, and only multiple samples can be selected for test due to the limitation of the conditions. In order to study the health condition of diesel engines comprehensively, the selected test samples should cover all health conditions as much as possible. The sample selection is primarily based on motor hours and refers to the parameters of diesel engine such as the parameters of acceleration, power, and fuel consumption together. The specific method is as follows: choose 10 tanks of 100 motor hours, 200 motor hours, 300 motor hours, and 400 motor hours, respectively, and calculate their acceleration, power, and fuel consumption. Two tanks with middle parameters are selected as samples, which make the samples more representative. Therefore, there are 4 stages and 8 samples to carry out the real vehicle test. According to the age of the people, the 4 stages are defined as “infantile period,” “youth stage,” “mature stage,” and “old age stage” respectively.
The key to quantitatively evaluate the health condition of diesel engine by correlation dimension is to find out the relationship between correlation dimension and the health condition of diesel engine. The correlation dimension of the same vehicle under different working conditions and different vehicles under the same working conditions can be calculated. The corresponding relationship between the correlation dimension and the health condition of the diesel engine can be discussed by plotting working condition-dimension curve and age-dimension curve, respectively, and aggregating the two kinds of curves into holographic curves.
The selected test samples are, respectively, collected in the six following conditions: working condition 1: neutral, 500 r/min, working condition 2: neutral, 1000 r/min, working condition 3: neutral, 1500 r/min, working condition 4: 1st gear, 1500 r/min, working condition 5: 2nd gear, 1500 r/min, working condition 6: 3rd gear, 1500 r/min.
In order to reduce the measurement error, the correlation dimension D2i of vibration signals can be calculated by intercepting multisegment data and the mean value is the correlation dimension D2 of vibration signals. Table 1, according to which the holographic curves of working conditions, ages, and dimension are plotted in Figure 20, shows the correlation dimension of the final determined diesel engine cylinder vibration signal at different ages under different working conditions.

It can be seen from Figure 20 that the trend of correlation dimension of the vibration signal of the cylinder head of the tank diesel engine varies with the working condition. The correlation dimension increases with the increase of the rotating speed but there is a decline of the correlation dimension at 1500 r/min of 2nd gear, which indicates that the diesel engine is in stable working condition under this condition. In addition, under the same working conditions, the correlation dimension will change with the change of the health condition as the age of the diesel engine gradually increases. As shown in Figures 4–23, the correlation dimension is large at 100 motor hours because just like a person in the “infantile” period, the tank diesel engine, of which the running time is short, is in the running-in period and the system movement is complicated.



At 200 motor hours, the correlation dimension value drops significantly as the diesel engine is running in and the age increases into the “youth” period. Of course, the health will decline, some degree of wear and tear will occur, and the correlation dimension will gradually increase as the engine continues to work.
6.2. Dimension Health Degree
From the above analysis, it can be found that the correlation dimension D2 of cylinder head vibration signal will change when the health condition of diesel engine changes. The better the health of the diesel engine, the smaller the correlation dimension D2 of the corresponding vibration signal; otherwise, the worse the health, the larger the correlation dimension D2. Therefore, the correlation dimension can be used to define the health condition of diesel engine, which is described by the dimension health degree HD2 as the following formula:
In the above formula, D2i is the correlation dimension of diesel engine vibration signals to be evaluated. D2min is the correlation dimension of diesel engine of the best health under the same working condition. D2max is the correlation dimension of diesel engine of the worst health under the same working condition.
Dimension health degree is a quantitative measure of the health condition of diesel engine defined by correlation dimension. Equation (64) gives the corresponding relationship between the correlation dimension and the health condition.
6.3. Age Health Degree
Data in Table 1 are transformed as Figure 21. The graph shows that the trend of correlation dimension is very similar to the change of the diesel engine health when the working condition is neutral with 1500 r/min, 1st gear with 1500 r/min, and 2nd gear with 1500 r/min. The condition of 2nd gear with 2500 r/min under which the real diesel engine works more stable is chosen to fit the curve and then the curve shown in Figure 22 is obtained.
It is found from Figure 22 that when the correlation dimension y = D2i is equal to the intersection point C of the graph, the health condition of the current diesel engine can be directly judged. However, when the correlation dimension is equal to two intersection points A and B of the graph, the health condition of the diesel engine cannot be accurately judged. Figure 22 shows that the age of the diesel engine will affect the health of the diesel engine. The age health HA of the diesel engine is defined as
In the above formula, Ai is the age of the diesel engine to be evaluated and AE is the design service life for diesel engine.
6.4. Comprehensive Health Degree
Considering the correlation dimension and age condition together, the comprehensive health degree H of diesel engine is defined as
Substitution of equations (64) and (65) in equation (66) yields the health degree of diesel engine as
In the formula, the weights of dimension health a1 and age health a2, respectively, reflect the influence degree of correlation dimension and age on diesel engine health. According to the statistical analysis, assign a1 = 0.7, a2 = 0.3. In this way, the health condition of diesel engine can be quantitatively evaluated by the health degree defined in (67).
In Figure 12, the three points A, B, C are calculated respectively, and the health degree is 0.73, 0.69, and 0.48 respectively, so that the health condition of the diesel engine can be accurately assessed.
In addition, considering other factors affecting the health of diesel engines, the health degree defined in equation (14) can be extended to
In the formula, Hn is the health degree of the influencing factors and an is the weight of the health degree of the influencing factors.
7. Verification of Evaluation Results
Random selection of 2 tanks was taken as verification samples, which were taken as sample 1 and sample 2. The vibration signals of cylinder head of diesel engine were measured, respectively, and their health degree was 0.85 and 0.33, respectively. The validity of the above evaluation results and the definition of health were verified by the eigenvalues of equivalent vibration intensity, acceleration and deceleration, and fuel consumption.
7.1. Vibration Rating
Vibration rating can be defined as where Vs is the equivalent vibration intensity, mm/s; Vx, Vy, Vz are the root mean square value of vibration velocity of each specified measuring point in three directions X, Y, Z respectively, mm/s; Nx, Ny, Nz are the numbers of points measured in three directions, respectively.
The vibration quality of diesel engine is divided into four grades: A, B, C, and D according to the degree of equivalent vibration intensity C and the meaning of the vibration quality is “excellent,” “good,” “tolerant,” and “unacceptable” respectively. Table 2 shows a multicylinder diesel engine vibration quality rating table.
The calculation results of equivalent vibration intensity for samples 1 and 2 are shown in Table 3. Their vibration ratings are A and C, respectively, indicating that the vibration quality of diesel engine is “excellent” and “tolerable,” respectively, which is consistent with the evaluation of the health condition.
7.2. Verification of Characteristics of Acceleration and Deceleration and Oil Consumption
The schematic diagram of the instantaneous speed and fuel flow characteristic of diesel engine is shown in Figure 23. The reference parameter values of a diesel engine’s technical condition are shown in Table 4, which is derived from the project of Professor Liu Jianmin of The Academy of Armored Forces Engineering [43].(1)Characteristic amount of instantaneous of diesel engine Acceleration time: in the instantaneous curve, the time from the speed n1 to the speed n2 during the acceleration process is ti = t2 − t1. Deceleration time: in the instantaneous speed curve, the time from the speed n3 to the speed n4 during the deceleration process is td = t4 − t3 after stopping the fuel supply.(2)Characteristic quantity of fuel flow
Fuel consumption during acceleration is
Table 5 shows the characteristics of health, fuel consumption, and acceleration and deceleration performance of sample 1 and sample 2 diesel engines. It can be found from table that the health degree of sample 1 and sample 2 is consistent with the health condition reflected by acceleration time, deceleration time, and fuel consumption in acceleration time. This indicates that it is feasible to evaluate the health of diesel engine by using health degree, and the result is reliable.
8. Conclusion
Based on the comprehensive diesel engine operation time and the technical condition of diesel engine, the idea of evaluating the health condition of diesel engine is put forward. Through the calculation and analysis of a large number of real vehicle data, the corresponding relationship between the correlation dimension and the health condition is discussed, and the corresponding relationship between the age condition and the health condition is defined, and the age health degree HA, dimension health degree HD2, and comprehensive health degree H are defined. These parameters can be used to quantitatively evaluate the health condition of diesel engine, and it has been proved that the method is feasible and the result is reliable.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.