Abstract

Fault analysis is important in both research and industry. Current fault analysis work is mainly concerned with fault prediction and classification and pays too little attention to fault evolution mechanisms. In this paper, we propose a fault analysis method for manufacturing systems based on catastrophe theory, with the aim of improving the effectiveness and efficiency of real-time monitoring of potential faults and of fault cause analysis. The key advantages of the proposed method are that (i) it uses catastrophe theory and big data analysis to establish a cusp catastrophe model of manufacturing system faults and uses this model to reveal the internal fault evolution mechanism of the manufacturing system, and (ii) with the established catastrophe model, it supports fault monitoring and accurate preventive control of the manufacturing system, helping to keep the system operating healthily.

1. Introduction

The modern manufacturing industry is characterized by high quality and high production efficiency, and current manufacturing systems are designed for good performance, high stability, and high repeatability. Production equipment is designed to be extremely precise, efficient, and intelligent, so even a small performance degradation or security risk may have serious consequences. The global manufacturing industry loses more than 100 billion dollars every year due to quality problems caused by machine failures and other issues. It is therefore vital to have a valid analysis approach to ensure safe operation of the equipment. Analysis of failure mechanisms and preventive control of manufacturing systems offer good potential for future manufacturing, such as smart manufacturing.

Current fault analysis methods are summarized in Table 1.

Research on fault analysis started in the 1960s. Early research regarded signal processing techniques and statistical analysis as major tools and primarily used artificial intelligence to extract fault features. Knowledge-driven methods need to establish a precise mathematical model based upon the understanding of the physical mechanism [1], parameter estimation [2], and parity spaces [3].

However, for complicated dynamic industrial processes, it is very difficult to build a mechanism model manually from deep insight into the system. Data-driven fault analysis requires neither an explicit mathematical model derived from prior knowledge nor a reasoning mechanism derived from experience. It uses different types of data mining technology to extract and classify fault features from the vast amount of acquired operating data [4]; these technologies include signal processing [5, 6], statistical analysis [7, 8], and early quantitative artificial intelligence methods [9]. With the increasing automation and intelligence of industrial equipment, along with the development and widening application of related advanced technologies [10, 11], data began to grow exponentially. It has become more important to process and analyze manufacturing system data in order to extract their diagnostic value. Hence, in recent years, value-driven methods have gradually become a hot topic for researchers. Among them, deep learning has attracted the most attention. Deep learning [12] is good at finding complex structures in high-dimensional data. It can extract fault features adaptively through a sufficient number of transformations and combinations and distill the physical significance of the features without manual intervention.

In summary, data-driven and value-driven methods are widely employed in fault analysis and make fast and accurate analysis possible when faults occur. In recent years, scholars have mainly focused on the following aspects: (1) value-driven methods, which mainly include deep learning [22], integration, and fusion [23, 24]. Zhang et al. [25] and Wang et al. [26] used deep learning for the extraction and classification of fault features, and Zhao et al. [12] used deep learning in machine health monitoring; and (2) data-driven methods. Dong et al. [18] studied data-driven fault analysis that integrates causality graphs with statistical process monitoring for complex industrial processes. Guo et al. [19] used topological data analysis to extract characteristics. Weiss et al. [20] extracted key information using a combination of heuristic algorithms. Zhou et al. [21] used k-nearest neighbors for fault isolation.

However, the above methods are unable to analyze the fault evolution mechanism behind potential faults of manufacturing systems, in particular complex manufacturing systems. Complex system theory was therefore introduced, through which the internal mechanism of fault operation can be revealed. Unfortunately, there is little research in the existing literature on the application of catastrophe theory to fault analysis. We found only [27], which studied and proved properties of catastrophes for finite-state systems and showed that catastrophe theory can also be used in fault injection.

Overall, the above research has promoted the integration of big data and intelligent manufacturing, offered useful ideas for research on manufacturing system failures, and provided a scientific basis for enterprises' production decisions. However, many of these methods address the identification of fault types and the prediction of fault occurrence time; they are result-oriented and fail to uncover the internal mechanism of fault evolution. Based on the above analysis, this paper proposes a method combining catastrophe theory and big data analysis to mine the internal fault evolution mechanism of manufacturing systems.

The method proposed in this paper applies to two scenarios in the analysis of failure mechanisms and preventive control of manufacturing systems: (1) after a fault occurs, catastrophe theory is used to model the existing fault in the manufacturing system and find the internal mechanism of fault evolution; and (2) by monitoring the system operation data, the corresponding catastrophe model is identified from the abnormal data. Using established catastrophe models to uncover the causes of a failure before it occurs helps control the system in terms of the fault cause. As a result, preventive maintenance of manufacturing systems is realized.

The main contributions of this paper are twofold as follows:

(i) We use catastrophe theory and big data analysis to establish a cusp catastrophe model of manufacturing system faults and, by analyzing this model, find the internal mechanism of fault evolution of the manufacturing system.

(ii) Based on the established catastrophe model, we realize fault monitoring and accurate preventive control of the manufacturing system and ensure its healthy operation.

The rest of this paper is organized as follows. Section 2 describes an overview of catastrophe theory analysis, together with a concept of catastrophe theory analysis based on fault analysis for manufacturing system. A simplified proof-of-concept case study is carried out to validate the proposed fault analysis process in Section 3. The research significance is illuminated in Section 4. Finally, Section 5 concludes the paper and outlines our future work.

2. Catastrophe Theory Analysis Based Fault for Manufacturing System

The catastrophe theory deals with how continuous gradual changes in nature and human society cause catastrophes or leaps and seeks to describe, predict, and control these catastrophes and leaps with uniform mathematical models.

In this paper, manufacturing system catastrophe is mainly reflected in two aspects: (1) the manufacturing system suddenly goes from the normal state into the fault state and (2) the manufacturing system suddenly returns from the fault state to the normal state. Catastrophe theory is used to study why these catastrophes occur in the manufacturing system and what strategies can be adopted to control them.

Before starting the model analysis, we first define a throughput threshold: the manufacturing system is considered to fail if its throughput is lower than this value.

2.1. Model and Analysis

In manufacturing systems, catastrophe theory [28] can be used to study the impact of changes in external conditions on industrial big data. In general, a catastrophe can cause irreversible damage and bring huge economic losses to the manufacturing industry, so it is of great practical value to use catastrophe theory to study manufacturing system faults. In this paper, the operation of the manufacturing system is considered from both internal and external aspects. Many internal and external factors affect the operation of manufacturing systems, for example, machine adjustment, machine aging, improper product design, irrational production technology and process design, improper processing methods, insufficient machining accuracy, machine maintenance errors, and operational management errors. It is very hard to analyze the impact of all these factors on the manufacturing system at the same time. Therefore, we introduce catastrophe models and macroscopic order parameters to reflect the real operating situation of the manufacturing system. The external macroscopic order parameter is defined in terms of production throughput, and the internal macroscopic order parameters are defined in terms of production load and duration. We describe the behavioral catastrophe of the manufacturing system with the cusp catastrophe model. The cusp model is one of the simplest catastrophe models; when it is used for fault description, the critical surface is easy to construct and has a strong geometric intuition.

The external macroscopic order parameter and the internal macroscopic order parameters are regarded as the state variable x and the control variables u and v of the manufacturing system, respectively. A cusp catastrophe model is established to describe the abnormal behavior of the system, using the data of each adjacent continuous data flow interval, including normal data and abnormal data. Following Hall [29], the basic model of the cusp catastrophe is

where x is the state variable and u and v are the control variables. In this paper, u is duration, v is production load, and x is throughput; the remaining symbols are coefficients. The phase space is therefore the three-dimensional space spanned by x, u, and v. The critical points of the potential function are the solutions of the equation obtained by setting its derivative with respect to x to zero, and the equilibrium surface is given by the same equation.
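The chain of objects used in this section (potential function, equilibrium surface, singularity set, bifurcation set) can be illustrated on the textbook normal form of the cusp catastrophe. The block below is a sketch under that assumption only: it uses the coefficient-free normal form, whereas the model in this paper carries additional coefficients, so the constants in the paper's own equations may differ.

```latex
% Textbook normal form of the cusp (an assumption; no model coefficients)
\begin{align}
  V(x)   &= x^{4} + u x^{2} + v x
         && \text{(potential function)} \\
  V'(x)  &= 4x^{3} + 2ux + v = 0
         && \text{(equilibrium surface)} \\
  V''(x) &= 12x^{2} + 2u = 0
         && \text{(singularity set)} \\
  \Delta &= 8u^{3} + 27v^{2} = 0
         && \text{(bifurcation set)}
\end{align}
% Elimination step: V'' = 0 gives u = -6x^2; substituting into V' = 0
% gives v = 8x^3, hence 8u^3 = -1728 x^6 and 27v^2 = 1728 x^6.
```

In this normal form, the bifurcation set is obtained by eliminating x between the second and third equations, which is the projection onto the u-v plane described next.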

The corresponding nonisolated singularities set is given by

Projecting onto the parametric plane u-v, we can obtain the bifurcation set equation:

After coordinate transformation, the catastrophe flow and bifurcation set equations are, respectively, as follows.

Catastrophe Flow.

Bifurcation Equation.

where . We can obtain

Equilibrium Surface. ,

Bifurcation Set. .

Here the two remaining quantities are the coefficients of the model. Their optimal values can be obtained by finding the extrema of a multivariate function. Suppose we collect several sets of continuous, adjoining data that include both normal and abnormal data, substitute them into the equations above, and obtain the cumulative error:

Our goal is to find the value of for each group ; we can obtain

So we use the sum of squares to find the optimal

According to above formula and , we can obtain

By solving (10), we obtain all of the extreme points. Because the surface of the cumulative error function opens upward, these extreme points are minima of the cumulative error.
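The sum-of-squares estimation described above can be sketched as follows. Since the paper's exact parametrisation is not reproduced here, the sketch assumes a hypothetical two-coefficient equilibrium surface, 4x^3 + 2a·u·x + b·v = 0, where `a` and `b` are illustrative names rather than the paper's notation. Under that assumption the residual is linear in the coefficients, so setting the partial derivatives of the cumulative squared error to zero is an ordinary linear least-squares problem.

```python
import numpy as np


def fit_cusp_coefficients(u, v, x):
    """Least-squares estimate of (a, b) in the ASSUMED surface
    4*x**3 + 2*a*u*x + b*v = 0 (hypothetical parametrisation).

    u, v, x are equal-length sequences of duration, production load
    and throughput samples taken from one continuous interval.
    """
    u, v, x = (np.asarray(arr, dtype=float) for arr in (u, v, x))
    # Residual r_i = 4 x_i^3 + a (2 u_i x_i) + b v_i is linear in (a, b).
    design = np.column_stack([2.0 * u * x, v])
    target = -4.0 * x**3
    # lstsq solves the normal equations, i.e. the stationarity
    # conditions of the cumulative squared error.
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = design @ coeffs - target
    a, b = coeffs
    return float(a), float(b), float(residuals @ residuals)
```

Because the cumulative squared error is a convex quadratic in the two coefficients, its stationary point is a minimum, which is consistent with the remark above that the error surface opens upward.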

From Figure 1, three cases can be distinguished on the control parameter plane u-v. If the control point (u, v) lies inside the region enclosed by the bifurcation set, x has three values on the equilibrium surface for each (u, v); that is, the manufacturing system has three modes, of which one exists in theory but cannot be reached in practice. If (u, v) lies on the bifurcation set, two of the modes merge into a single one; in this case, the manufacturing system changes suddenly from one mode to the other. If (u, v) lies outside the bifurcation set, x has only one value, which indicates that the manufacturing system is single-mode. The bifurcation set is therefore the separatrix between the single-mode and the multimode regions of the manufacturing system, and it is where the modal catastrophe occurs. We also found that a catastrophe occurs only when the control parameters move from the multimode region to its edge, the bifurcation set; in actual operation, the control parameters do not reach this edge from the single-mode region. So, if the control parameters of the manufacturing system are located in the multimode region, the manufacturing system is in a risky state: the control parameters can easily move to the edge, and a catastrophe occurs. In contrast, if the control parameters are located in the single-mode region, the manufacturing system stays in a stable state.

Let the upper lobe represent the normal state of the manufacturing system and the lower lobe the fault state. Following the above analysis, in the multimode region the manufacturing system has two reachable modes: one normal and one faulty. As the control parameters (u, v) change, the manufacturing system remains in one state until (u, v) reaches the bifurcation set; it then suddenly jumps from the upper lobe to the lower lobe and breaks down, or the reverse. Therefore, to keep the manufacturing system running healthily, we should control (u, v) so that it stays in the single-mode region; the purpose of preventive control is then achieved.
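The three cases above can be checked numerically. The sketch below assumes the textbook normal form of the cusp, with equilibrium surface 4x^3 + 2ux + v = 0 and bifurcation set 8u^3 + 27v^2 = 0; the paper's fitted model has extra coefficients, so this is an illustration of the geometry rather than the paper's model. The function names are hypothetical.

```python
import numpy as np


def cusp_discriminant(u, v):
    """Delta = 8u^3 + 27v^2 for the ASSUMED normal form of the cusp."""
    return 8.0 * u**3 + 27.0 * v**2


def equilibria(u, v, tol=1e-6):
    """Real solutions x of 4x^3 + 2ux + v = 0, sorted ascending."""
    roots = np.roots([4.0, 0.0, 2.0 * u, v])
    real = roots[np.abs(roots.imag) < tol].real
    return np.sort(real)


def classify(u, v, eps=1e-9):
    """Label a control point by its position relative to the bifurcation set."""
    delta = cusp_discriminant(u, v)
    if delta < -eps:
        return "multimode"        # three equilibria: risky region
    if delta > eps:
        return "single-mode"      # one equilibrium: stable region
    return "bifurcation set"      # two equilibria merge: jump occurs
```

For example, `classify(-3.0, 0.0)` falls in the multimode region and `equilibria(-3.0, 0.0)` returns three values of x, while `classify(1.0, 1.0)` is single-mode with one equilibrium.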

2.2. Management Research

Through the analysis of the catastrophe model shown in Figure 1, we can obtain the boundary between the normal and abnormal states of the manufacturing system. To apply the above research to management, we define the following related events:

(1) Event A: in the operation data of the manufacturing system, if the control variables , the logical value of event A is 1; otherwise, it is 0.

(2) Event B: if the control variables , the logical value of event B is 1; otherwise, it is 0.

(3) Event C: if the control variables , the logical value of event C is 1; otherwise, it is 0.

(4) Event D: in the operation data of the manufacturing system, if the control variables , the logical value of event D is 1; otherwise, it is 0.

Then, according to the above definition, we get the following, as shown in Table 2.

For manufacturing enterprises, the best outcome is to reduce the occurrence of production failures. Frequent failures not only affect the morale of the company's employees but also cause production to stagnate, which seriously affects the company's normal operation and profits. Therefore, based on the above research results, we can carry out preventive control. The control process can be divided into three parts (Figure 2): (1) measure the actual operation of the manufacturing system according to the logic relationship established in the table; (2) judge whether the manufacturing system will suddenly fall into failure according to the catastrophe model; (3) take management actions on the control parameters to prevent the system from failing.

To illustrate the core ideas of this paper visually, we hold the control variable u fixed and analyze how x changes with v in Figure 2.

First, project Figure 1 onto the x-v plane to obtain the internal mechanism part shown in Figure 2. Curves 1-3 and 3-4 represent the upper branch of the counter-S curve in Figure 1 and indicate the normal state of the manufacturing system, where point 3 is the critical point at which the system goes from single mode to multimode. Point 4 is in the bifurcation set and represents the critical point where the manufacturing system jumps from the normal state to the fault state. Curve 4-2 represents the central part of the counter-S curve in Figure 1 and represents the state that the manufacturing system cannot reach in actual operation, where point 2 is in the bifurcation set and represents the critical point at which the manufacturing system jumps from the fault state to the normal state. Curves 2-5 and 5-6 represent the lower branch of the counter-S curve in Figure 1 and indicate the fault state of the manufacturing system. Curve 2-5 indicates that, although the manufacturing system is faulty, it can be restored to the normal state by changing the values of u and v, whereas curve 5-6 indicates that the manufacturing system cannot recover by itself and can only be repaired manually.

Analyzing Figure 2, when v = v1, the state variable x of the manufacturing system is located at xo1 on the upper branch, and when v is increased from v1 to v4, the state variable x changes to xo4. At this point, if an infinitesimal perturbation is added to v4, the state variable x jumps from xo4 to xo5. When v is then reduced to v2, the state variable x changes to xo2. At this point, as soon as v becomes slightly smaller than v2, the state variable x jumps from xo2 to xo3 at point 2 and enters the upper branch of the counter-S curve. Therefore, according to the above analysis, we can control the failure of the manufacturing system.

The specific process is shown in Figure 2: first, the operation data of the manufacturing system are collected and analyzed; then, according to Table 2, the data are displayed in real time as bars in different colors. In this way, we can find operational problems in the manufacturing system and then control the system according to the internal fault mechanism, to keep it running in a healthy state.
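The jump behavior along the counter-S curve can be reproduced with a small sweep. The sketch below assumes the textbook normal form of the cusp (potential x^4 + u·x^2 + v·x, not the paper's fitted model) with u held fixed and negative; the helper names and the grid are hypothetical. At each value of v the state follows the nearest stable equilibrium, so it stays on its current branch until that branch disappears.

```python
import numpy as np


def stable_equilibria(u, v, tol=1e-6):
    """Stable real roots of 4x^3 + 2ux + v = 0 (minima of the ASSUMED
    potential x^4 + u x^2 + v x); falls back to all real roots."""
    roots = np.roots([4.0, 0.0, 2.0 * u, v])
    real = roots[np.abs(roots.imag) < tol].real
    stable = real[12.0 * real**2 + 2.0 * u > 0.0]
    return stable if stable.size else real


def sweep(u, v_values, x0):
    """Track the state x while v is varied slowly with u fixed."""
    x, path = x0, []
    for v in v_values:
        candidates = stable_equilibria(u, v)
        # Stay on the current branch for as long as it exists.
        x = candidates[np.argmin(np.abs(candidates - x))]
        path.append(x)
    return np.array(path)


def jump_location(v_values, path):
    """Value of v at which the largest single-step change in x occurs."""
    i = int(np.argmax(np.abs(np.diff(path))))
    return float(v_values[i + 1])


v_grid = np.linspace(-4.0, 4.0, 81)
up_path = sweep(-3.0, v_grid, x0=1.5)            # upper branch, v increasing
down_path = sweep(-3.0, v_grid[::-1], x0=-1.5)   # lower branch, v decreasing
v_jump_down = jump_location(v_grid, up_path)
v_jump_up = jump_location(v_grid[::-1], down_path)
```

Under these assumptions the two jumps occur at different values of v, one on each side of zero. This is the hysteresis described above: the drop from the upper branch and the return to it do not happen at the same control value.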

3. A Simplified Case Study

A simplified proof-of-concept case illustrates the process of the proposed method. Yonggu is a company that produces metal tools and has a complete IoT (Internet of Things) system in its workshop. We use the IoT system to collect real-time data from the production workshop and then select adjacent data, including normal data and abnormal data, to establish the catastrophe model.

In this section, the throughput and the production load of the manufacturing system within the duration time are selected as the monitoring parameters. In addition, it should be noted that according to the parameter estimation requirements of the catastrophe model, the data must be extracted in an interval; that is, the data in an interval must be continuous in time.

The specific procedure is shown in Figure 3.

3.1. Data Extraction and Modeling

Suppose that a series of data groups, indexed by sampling time, is obtained over a time period. In this paper, clustering is used to distinguish normal from abnormal data in the original data sequence. The method separates normal and abnormal data according to the similarity between data points (measured by the distance between them), and it removes the effect of isolated or noisy points on the classification. After the sampled data have been preprocessed, the adjacent normal and abnormal data of the state variable are obtained. Some sample data are shown in Figure 4.
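The distance-based separation of normal and abnormal samples can be sketched with a two-cluster k-means on the state variable. This is a minimal illustration, not the preprocessing pipeline used in the case study: it assumes one-dimensional throughput values, exactly two clusters, and that the cluster with the lower mean throughput is the abnormal one. The function name and the sample values are hypothetical.

```python
import numpy as np


def two_means_1d(values, n_iter=100):
    """Two-cluster k-means on a 1-D series.

    Returns (labels, centers); label 0 is the low-throughput cluster
    (ASSUMED abnormal) and label 1 the high-throughput cluster.
    """
    x = np.asarray(values, dtype=float)
    # Deterministic start: one center at each end of the data range.
    centers = np.array([x.min(), x.max()])
    labels = np.zeros(x.shape, dtype=int)
    for _ in range(n_iter):
        # Assign every sample to its nearest center.
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = np.array([
            x[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


throughput = [10.0, 11.0, 9.0, 50.0, 52.0, 49.0]   # made-up sample values
labels, centers = two_means_1d(throughput)
```

Handling of isolated or noisy points, which the text mentions, is not covered by this sketch and would need an additional filtering step.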

In each interval, the data were collected continuously and consist of both abnormal data and normal data. The catastrophe model was established with the data of one interval, and the parameters of the model were then revised with the data of the other intervals to refine the model.

3.2. Modeling and Analysis

Now we use the above data to set up catastrophe model. First, to satisfy the above equation of catastrophe flow and bifurcation set, the following equation is satisfied:

And then, according to above formula and , we can obtain

By analyzing the above model, the computational complexity is . Using the above data and reference [26], 10 solutions were obtained, by using . The optimal values and can then be obtained. Next, the data of the other intervals are used to adjust the parameters of the model, and we finally obtain and .

We now examine some of the dynamic catastrophe behaviors of the manufacturing system when and .

Equilibrium Surface. .

Bifurcation Set. .

The graph of the equilibrium surface and the bifurcation set is shown in Figure 5. Let the system state be represented by the phase point (x, u, v) in the three-dimensional space; then the phase point must be located on the surface and always on its upper lobe or lower lobe. In Figure 5, the upper lobe indicates the normal state of the manufacturing system, the lower lobe indicates the fault state, and the folded part indicates a state that the manufacturing system cannot reach.

The projection of the equilibrium surface onto the control plane, namely, u-v, is a topological mapping, which can be represented by

We thus obtain the monitoring view shown in Figure 6.

The first panel in Figure 6 corresponds to path (1) in Figure 5. If the control parameters (u, v) are not controlled at 9:50, they will fall on the bifurcation curve at the next moment, and the manufacturing system will suddenly break down. At this time, the manufacturing system is still able to operate and can be restored to normal operation through certain adjustments; if (u, v) is still not controlled, however, it will enter the region in which the system cannot work and manual repair is required. The second panel in Figure 6 is the result of controlling (u, v) before the system fault, corresponding to path (2) in Figure 5. Path (2) shows that, by preventing (u, v) from reaching the bifurcation set at point 1, we can change the evolutionary path of the system and prevent it from breaking down suddenly. The third panel in Figure 6 is the result of controlling (u, v) after the system fault, corresponding to path (3) in Figure 5. Path (3) shows that, by changing the values of u and v so that the bifurcation condition is satisfied at point 2, we can bring the manufacturing system suddenly back to normal.

Data-driven methods such as support vector machines (SVM), PCA, and spectrum analysis rely on a large amount of real-time data for training, so they place requirements on the quantity of data. In this paper, however, the fault data form a small sample, so data-driven methods cannot be adopted. In the following, a value-driven method is used to predict and analyze manufacturing system faults.

We now build a four-layer network as shown in Figure 7: the input layer has 2 neurons, the first hidden layer has 12 neurons, the second hidden layer has 6 neurons, and the output layer has 1 neuron. The elastic conjugate gradient descent method with momentum is used to train the network on 368 groups of the experimental data in Figure 4. To be consistent with the events in Figure 6, we predict and analyze the next 10 events.
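The 2-12-6-1 architecture described above can be sketched as a plain forward pass. Only the layer sizes come from the text; the tanh activation, the linear output, the random initialisation, and all function names are assumptions made for illustration, and the training procedure is not reproduced.

```python
import numpy as np

LAYER_SIZES = [2, 12, 6, 1]   # input, hidden 1, hidden 2, output


def init_params(sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        params.append((weights, np.zeros(n_out)))
    return params


def forward(params, inputs):
    """Forward pass: tanh hidden layers (ASSUMED), linear output layer."""
    h = np.asarray(inputs, dtype=float)
    for weights, bias in params[:-1]:
        h = np.tanh(h @ weights + bias)
    weights, bias = params[-1]
    return h @ weights + bias


def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(w.size + b.size for w, b in params)


params = init_params(LAYER_SIZES)
outputs = forward(params, np.zeros((5, 2)))   # batch of 5 two-feature samples
```

A network of this shape has 121 trainable parameters, which is of the same order as the 368 training groups and helps explain why a small fault sample limits its predictive power.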

Figure 8 shows that, after more than 4000 training iterations, the network model achieves good training results. Figure 9, however, shows that deep learning predicts poorly in this case. The reason is that the fault data form a small sample, which makes it difficult for the deep learning method to predict well. Moreover, the method is result-oriented and does not give the internal mechanism of fault evolution.

The method proposed in this paper, which combines an analytical model with a data-driven approach, can effectively compensate for the drawback that data-driven and value-driven methods require too large a volume of fault data.

The above results show that the cusp catastrophe model established in this paper for manufacturing systems can reveal the internal mechanism of fault evolution and achieve preventive control of faults.

4. Research Significance

Previous fault analysis methods, whether data-driven or value-driven, can accurately predict the type of a fault and the time of its occurrence; however, they cannot identify the cause of the fault or its evolution mechanism. Understanding the evolution mechanism of faults is therefore of great significance to the management and operation of an enterprise. In this paper, the catastrophe model of manufacturing system faults is established, and by solving and analyzing the model it is found that

(1) If the control variables , the manufacturing system will stay in the normal state.

(2) If the control variables , the manufacturing system will stay in the normal state. However, with the change of , the manufacturing system will break down suddenly on .

(3) If the control variables , the manufacturing system will break down and require manual repair.

(4) If the control variables , by changing into , we can bring the manufacturing system back to normal suddenly.

Through the above analysis, the internal mechanism of fault evolution is found and the preventive control of fault is realized.

5. Conclusion

In this paper, a cusp catastrophe model was proposed to describe manufacturing system faults and to explain the fault evolution mechanism in a production process. First, the operation state of the manufacturing system is described with internal and external macro order parameters. The external macro order parameter is taken as the state variable x and the internal macro order parameters as the control variables u and v. To solve for the model coefficients, the k-means clustering algorithm is used for data preprocessing, and the extrema of a multivariate function are then used to obtain the optimal coefficient values. Finally, the dynamics method is used to analyze the cusp catastrophe model, to find the internal mechanism of fault evolution in the manufacturing system, and to design the logic operation according to this mechanism, so as to realize real-time monitoring and preventive control of the manufacturing system. As a verification example, the method was trialed at the Yonggu Company for one year, where it shortened fault response time by 128.47 hours and reduced losses by $33,000.

Data Availability

The experimental data in Figure 4 used to support the findings of this study belong to the Yonggu Group in Zhejiang Province. Hence, we provide only part of the data, which is included in the supplementary information file(s).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Natural Science Foundation of China (71571072), the Guangdong Provincial Natural Science Foundation Project (2018A030313079), and the 2018 Guangzhou Philosophy and Social Science Development “13th Five-Year Plan” (2018GZYB16).

Supplementary Materials

Some experimental data are given in the attachment; the three columns of data represent, respectively, duration, production load, and throughput.