Abstract
This study proposes a fault diagnosis method for discrete event systems based on a Petri net model with partially observable transitions. The structure of the Petri net model and the initial marking are assumed to be known, and faults are modeled by unobservable transitions. One contribution of this work is the use of the structure information of the Petri net to construct an online fault diagnoser that describes whether the system behavior is normal or potentially faulty. By modeling the flow of tokens in the particular places that carry fault information, the variation of tokens in these places can be calculated. Whether the outputs and inputs of these places are enabled is determined by analyzing some special structures. Because the method relies on structure information, traversing all the states is not required. Furthermore, its polynomial computational complexity allows the method to meet real-time requirements. Another contribution is the simplification of the subnet model with reduction rules before the diagnostic process is conducted. Removing nodes that do not contain the necessary diagnostic information reduces the memory cost.
1. Introduction
Fault diagnosis is a critical issue in most industrial systems for preserving the safety of equipment and human operators. It has been addressed in numerous studies on discrete event systems [1–6]. Once a fault has been detected and identified, the control law needs to be modified so that operations can proceed safely.
A discrete event system (DES) is a dynamic system driven by events. When a sensor signal or a movement of an actuating device in the system is detected, the state of the DES changes automatically. Nowadays, DESs appear in many intelligent systems, including robotic, manufacturing, and transportation systems.
In the previous literature, the fault diagnosis of DESs is addressed by several model-based approaches [7–10], such as those using automata models, which usually lead to the construction of a diagnoser automaton. Moreover, the diagnoser can be used to analyze the diagnosability of a system, i.e., to check whether the occurrence of unobservable events associated with faults can be detected by observing words of finite length.
Although automata models are suitable for describing DESs, their application is limited by the size of the system. These models explicitly enumerate all possible states and thus become very large as the system grows. To avoid enumerating the system states and the resulting state explosion, Petri nets are exploited for fault diagnosis, given the merits of their graphical structure. Most of the recent studies on fault diagnosis are based on the analysis of the Petri net reachability graph [11–14], the direct properties of Petri nets [15, 16], and the structural analysis of Petri nets [17, 18].
Given that the approach proposed in the present work is based on Petri nets, the related studies are briefly recalled. Giua and Seatzu [19] presented a diagnostic approach that avoids the exhaustive enumeration of DES states and introduced basis markings and justifications, with which the markings consistent with the actual observations are characterized and the set of unobservable transitions that enable them is established. Cabasino et al. [20] developed similar approaches, such as the modified basis reachability graph and the basis reachability diagnoser, to compress the state space of bounded Petri nets. Basile et al. [21] put forward an approach for checking K-diagnosability based on integer linear programming. Liu et al. [22] proposed an on-the-fly incremental diagnostic technique that can be applied to analyze both the diagnosability and K-diagnosability of bounded and also live Petri nets. Ben et al. [23] used T-invariants to define the priorities of branch investigation and rapidly determine the existence of an indeterminate cycle. Hadjicostis and Verghese [24] introduced system redundancy to detect and then isolate fault markings. Prock [25] proposed an online technique for fault diagnosis that monitors the number of tokens within the P-invariants: if the total number of tokens within the P-invariants changes, an error is detected. Ru et al. [26] presented a method to address DESs with partial observation of states and events on the basis of the transformation of partially observed Petri nets into labeled Petri nets. Dotoli et al. [12] presented an algorithm based on the definition and solution of integer linear programming problems; the algorithm characterizes the fault-behavior properties of the system in order to reduce the online computational effort. Fabre et al. [27] proposed a net-unfolding approach to design an online asynchronous diagnoser that can avoid state explosion. However, the online computational effort of that approach is high because the Petri net structure is constructed online by means of unfolding.
Most of the previous methods cannot avoid the state explosion problem. The computational complexity of the corresponding algorithms is exponential, so they are not suitable for online use with large intelligent systems.
To avoid the state explosion problem, this paper proposes an online fault diagnosis strategy based on the structure information of partially observed Petri nets. A number of reduction rules are adopted to simplify the construction of special structures without changing the diagnosability property of the system. More precisely, the structure of the Petri net model and the initial marking are assumed to be known, and the faults are modeled by unobservable transitions. Additional unobservable transitions may be associated with legal system behavior. An algorithm then decides whether the system behavior is normal or potentially faulty when an observed sequence occurs, and special paths are defined to depict the structure information of Petri nets. Furthermore, subnets that contain fault information are constructed to describe the flow of tokens. Based on these tokens, it is determined whether the inputs or outputs of particular places, namely the input or output places of fault transitions, are enabled. Finally, the fault diagnoser determines the inherent fault behavior.
The main advantages of the present work are as follows. Compared with the work of Fabre et al. [27], our algorithm relies on the structure information of Petri nets and does not require offline calculations. The reduction rules also make our method more widely applicable than that of Fabre et al. [27], because they reduce the memory cost. The diagnostic algorithm in this paper avoids the state explosion, which makes it a reasonably efficient method for online use with large intelligent systems.
The remainder of the paper is structured as follows. Section 2 provides the basic definitions and notations. Section 3 presents the special structures used to describe the inherent fault information and the reduction rules applied to simplify the model. Section 3.5 specifies the algorithm for online fault diagnosis and proposes the fault diagnoser. Section 4 gives an example of an intelligent warehouse center to verify the algorithm. Section 5 draws the conclusions.
2. Preliminaries
This section introduces the basic characteristics of Petri nets. For the detailed discussion on Petri nets, refer to [28].
2.1. Basic Petri Net Notations
Definition 1. A Petri net is a structure $N = (P, T, Pre, Post)$, where (i) $P$ is the finite set of places; (ii) $T$ is the finite set of transitions, and $P \cap T = \emptyset$; (iii) $Pre: P \times T \to \mathbb{N}$ is the pre-incidence function, where $\mathbb{N}$ is the set of natural numbers; (iv) $Post: P \times T \to \mathbb{N}$ is the post-incidence function. The symbol $^{\bullet}p$ ($^{\bullet}t$) is used for the pre-set of place $p$ (transition $t$) and $p^{\bullet}$ ($t^{\bullet}$) is used for the post-set of place $p$ (transition $t$), respectively; e.g., $^{\bullet}t = \{p \in P \mid Pre(p, t) > 0\}$.
The marking $M$ of a Petri net $N$ is represented by a $|P|$-vector that assigns a nonnegative number of tokens to each place $p$ of $P$, i.e., $M: P \to \mathbb{N}$. Then, $M(p)$ is the number of tokens present in $p$ at the marking $M$.
A Petri net system is a pair $\langle N, M_0 \rangle$, where $N$ is a connected graph with at least one place and one transition and $M_0$ is an initial marking of $N$. $R(N, M_0)$ denotes the set of reachable markings of the net system $\langle N, M_0 \rangle$.
Given the Petri net $N$ and a marking $M$, a transition $t$ is enabled at $M$ if and only if $t \in T$ and $M \geq Pre(\cdot, t)$, which is denoted as $M[t\rangle$. An enabled transition $t$ may fire and yield the marking $M' = M + C(\cdot, t)$, where $C = Post - Pre$, which is denoted as $M[t\rangle M'$.
A firing sequence from $M$ is a sequence of transitions $\sigma = t_1 t_2 \cdots t_k$ such that $M[t_1\rangle M_1[t_2\rangle M_2 \cdots [t_k\rangle M_k$, which is denoted as $M[\sigma\rangle M_k$. $T^{*}$ is the set of all sequences in a Petri net. An enabled sequence is denoted as $M[\sigma\rangle$, while $t \in \sigma$ implies that transition $t$ belongs to sequence $\sigma$. A marking $M$ is reachable from the initial marking $M_0$ if and only if there exists a sequence $\sigma$ that satisfies $M_0[\sigma\rangle M$.
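The enabling and firing rules above can be illustrated with a minimal sketch. It assumes that the pre- and post-incidence functions are stored as integer matrices indexed as `matrix[place][transition]`; the function names and the two-place net are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence, Tuple

Marking = Tuple[int, ...]
Matrix = List[List[int]]  # indexed as matrix[place][transition]


def is_enabled(marking: Marking, pre: Matrix, t: int) -> bool:
    """A transition is enabled when every place holds at least Pre(p, t) tokens."""
    return all(tokens >= row[t] for tokens, row in zip(marking, pre))


def fire(marking: Marking, pre: Matrix, post: Matrix, t: int) -> Marking:
    """Return M + C(., t) with C = Post - Pre; a disabled transition is rejected."""
    if not is_enabled(marking, pre, t):
        raise ValueError(f"transition {t} is not enabled at {marking}")
    return tuple(
        tokens + post_row[t] - pre_row[t]
        for tokens, pre_row, post_row in zip(marking, pre, post)
    )


def fire_sequence(marking: Marking, pre: Matrix, post: Matrix,
                  sigma: Sequence[int]) -> Marking:
    """Fire the transitions of sigma one after another, starting from marking."""
    for t in sigma:
        marking = fire(marking, pre, post, t)
    return marking


# Hypothetical net: transition 0 moves a token from p0 to p1, transition 1 moves it back.
PRE = [[1, 0],
       [0, 1]]
POST = [[0, 1],
        [1, 0]]
M0 = (1, 0)
```

With this data, transition 0 is enabled at `M0` while transition 1 is not, and firing the sequence `[0, 1]` returns the net to `M0`.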
Set T can be partitioned into disjointed sets of observable transitions (represented by filled sticks) and unobservable transitions (represented by empty sticks) referred to as and , respectively. An observed sequence is , where . In this paper, the fault events are supposed to be unobservable, i.e., .
Definition 2 (see. [26]). A partially observed Petri net G is a 3-tuple , where(i)N is a Petri net with n places and m transitions(ii) is the set of observable places with (iii) is the set of observable transitions with In this paper, we assume the set , which means all the places are unobserved.
2.2. Subnets and Projections
In this part we present the definitions of subnets and projections.
Definition 3 (see. [29]). Given a Petri net , is a -induced subnet on N, where , , and and are the restrictions of Pre and Post to and , respectively.
Definition 4 (see . [30]). Given a Petri net N, a path is an oriented sequence which is alternately comprised of the nodes of the Petri net N, denoted as .
Consider a subnet and a path in a Petri net. is outside of if and only if all nodes of do not belong to , denoted as .
Definition 5. Given a set of transitions and a sequence in a Petri net, is the projection of transitions in on .
Example 1. If , where are observable and are unobservable, and set , then .
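The projection of Definition 5 amounts to filtering a sequence while preserving order. The following sketch assumes that transitions are identified by strings; the names `t1`, `u2`, and so on are hypothetical and are not the ones used in Example 1.

```python
from typing import Iterable, List, Set


def project(sigma: Iterable[str], subset: Set[str]) -> List[str]:
    """Keep, in their original order, the transitions of sigma that belong to subset."""
    return [t for t in sigma if t in subset]


# Hypothetical sequence in which t1 and t3 are observable and u2 and u4 are not.
SIGMA = ["t1", "u2", "t3", "u4", "t1"]
OBSERVABLE = {"t1", "t3"}
```

Here `project(SIGMA, OBSERVABLE)` yields `["t1", "t3", "t1"]`, which is the word an external observer would record.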
2.3. Basic Assumptions
In this part, some assumptions exploited in this study are presented in advance.
Assumption 1. No cycle of unobservable transitions exists.
Assumption 2. The Petri net N that models the DES and the initial marking are known.
Assumption 3. Once a fault occurs, the system remains faulty indefinitely. Such faults are called permanent.
Assumption 1 is commonly adopted in the field of fault diagnosis of Petri net models, whereas Assumptions 2 and 3 correspond to levels of system knowledge.
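Assumption 1 can be checked on a given model before diagnosis starts. The sketch below is one possible check and is not taken from the paper: it assumes that the net is described by two hypothetical dictionaries mapping each transition to its sets of input and output places, links two unobservable transitions whenever an output place of the first is an input place of the second, and searches the resulting graph for a cycle by depth-first search.

```python
from typing import Dict, Set


def has_unobservable_cycle(inputs: Dict[str, Set[str]],
                           outputs: Dict[str, Set[str]],
                           unobservable: Set[str]) -> bool:
    """Return True if some cycle consists only of unobservable transitions."""
    # u follows t when an output place of t is an input place of u.
    succ = {t: {u for u in unobservable if outputs[t] & inputs[u]}
            for t in unobservable}
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {t: WHITE for t in unobservable}

    def visit(t: str) -> bool:
        color[t] = GREY
        for u in succ[t]:
            if color[u] == GREY:  # back edge: a cycle has been closed
                return True
            if color[u] == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in unobservable)


# Hypothetical chain a -> b -> c whose last transition feeds the first place again.
INPUTS = {"a": {"p1"}, "b": {"p2"}, "c": {"p3"}}
OUTPUTS = {"a": {"p2"}, "b": {"p3"}, "c": {"p1"}}
```

If all three transitions are unobservable, the check reports a cycle; if `c` is observable, the cycle contains an observable transition and Assumption 1 holds.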
3. Special Structures and Reduction Rules
In this section, we focus on some special structures of Petri nets and on the reduction rules applied to them.
3.1. Special Structures
Definition 6. Given a place p and a path in a Petri net model, this path is defined as an observable path of this place p, denoted as , if it satisfies the following:(i)Its beginning node (or end node) is place p and its end node (or beginning node) is an observable transition(ii)The rest of its transitions are unobservable
Definition 7. Given a place p in a Petri net N, if exists, an observed subnet of place p is denoted as which is only comprised of all .
Consider a partially observed Petri net N and a place p. Assume that the number of nodes in N is and the number of nodes in is . Obviously and the computational complexity of constructing is polynomial.
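As an illustration of Definition 6, the following sketch enumerates the observable paths that begin at a place and end at an observable transition. It is a simplified reading of the definition, not the construction used in the paper: the dictionaries `place_out` (place to output transitions) and `trans_out` (transition to output places) and all node names are hypothetical, and the paths that end at the place are obtained symmetrically by following the arcs backwards.

```python
from typing import Dict, List, Set


def observable_paths_from(p: str,
                          place_out: Dict[str, Set[str]],
                          trans_out: Dict[str, Set[str]],
                          observable: Set[str]) -> List[List[str]]:
    """Paths place, transition, place, ... that start at p, end at the first
    observable transition, and otherwise contain only unobservable transitions."""
    paths: List[List[str]] = []

    def extend(path: List[str]) -> None:
        place = path[-1]
        for t in sorted(place_out.get(place, ())):
            if t in observable:
                paths.append(path + [t])  # the path ends here
            else:
                for q in sorted(trans_out.get(t, ())):
                    if q not in path:  # extra guard against revisiting a place
                        extend(path + [t, q])

    extend([p])
    return paths


# Hypothetical net: p1 -> t1 (observable) and p1 -> u1 (unobservable) -> p2 -> t2 (observable).
PLACE_OUT = {"p1": {"t1", "u1"}, "p2": {"t2"}}
TRANS_OUT = {"u1": {"p2"}}
OBSERVABLE = {"t1", "t2"}
```

The union of the nodes on all such paths gives the node set of the observed subnet of Definition 7. Under Assumption 1 the search terminates even without the guard; when only the node set is needed, marking visited nodes instead of storing whole paths avoids enumerating every path.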
Definition 8. Given a place p in a Petri net N, a is defined as a diagnosable subnet of place p, denoted as , if it satisfies the following:(i)Any places inside cannot be connected to any transitions outside (ii)Any unobservable transitions inside cannot be the inputs of any places outside
Example 2. The yellow path is an observable path of place , namely, , as shown in Figure 1.

Example 3. Figure 2 is an observed subnet of place , namely, . It is also a .
For a fault transition , its inputs and outputs are and . If a place (or ), then we use (or ) for brief writing.

Definition 9. Given a fault transition , a path is called a fault observable path of , if it satisfies the following:(i)The beginning node of is (ii) contains
Definition 10. Given a fault transition , a path is called a fault observable path of , if it satisfies the following:(i)The beginning node of is (ii) contains
Definition 11. Given a in a Petri net N, is called a fault-diagnosable subnet of place , if it satisfies the following:(i) is a part of (ii) does not belong to
Definition 12. Given a in a Petri net N, is called a fault-diagnosable subnet of place , if it satisfies the following:(i) is a part of (ii) does not belong to
Example 4. Place is the output of fault and Figure 3 is a fault-diagnosable subnet .

3.2. Reduction Rules
The basic structures of Petri nets are and-joint, and-split, or-joint, or-split, loop, and sequence, as shown in Figure 4. Some reduction rules based on these structures were already proposed in [31, 32]. However, in the rule (i) in [31, 32], the authors did not consider the inherent fault information on the unobservable transition, and the rules (ii) and (iii) are not mentioned.

However, not all rules can preserve the diagnosability property. To overcome this limitation, a few other simple rules are proposed as shown in Figure 5.

3.2.1. Fusion of Places in Vertical (Figure 5(a))
If a normal and unobservable transition with exactly one input place and one output place lies between two places, then this transition can be removed and the two places can be merged into a new place, which reduces the number of nodes in the net. The marking of the new place is the sum of the markings of the two original places.
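A minimal sketch of this rule is given below. It checks only the conditions stated in the text (the transition is unobservable, carries no fault, and has one input and one output place); any further side conditions of the original rule in Figure 5(a) are not modeled. The dictionary-based net representation and the naming of the merged place are assumptions made for the illustration.

```python
from typing import Dict, Set, Tuple

Arcs = Dict[str, Set[str]]  # transition name -> set of place names


def fuse_series_places(inputs: Arcs, outputs: Arcs, marking: Dict[str, int],
                       t: str, unobservable: Set[str],
                       faults: Set[str]) -> Tuple[Arcs, Arcs, Dict[str, int]]:
    """Remove transition t and merge its input and output place into one place."""
    if t not in unobservable or t in faults:
        raise ValueError("only normal unobservable transitions may be removed")
    if len(inputs[t]) != 1 or len(outputs[t]) != 1:
        raise ValueError("the transition must have exactly one input and one output")
    (p_in,) = inputs[t]
    (p_out,) = outputs[t]
    if p_in == p_out:
        raise ValueError("self-loops are outside the scope of this sketch")
    merged = f"{p_in}+{p_out}"  # hypothetical name for the fused place

    def rename(places: Set[str]) -> Set[str]:
        return {merged if p in (p_in, p_out) else p for p in places}

    new_inputs = {u: rename(ps) for u, ps in inputs.items() if u != t}
    new_outputs = {u: rename(ps) for u, ps in outputs.items() if u != t}
    new_marking = {p: m for p, m in marking.items() if p not in (p_in, p_out)}
    new_marking[merged] = marking[p_in] + marking[p_out]  # markings are summed
    return new_inputs, new_outputs, new_marking


# Hypothetical chain p0 -> t1 -> p1 -> u -> p2 -> t2 -> p3 with u normal and unobservable.
INPUTS = {"t1": {"p0"}, "u": {"p1"}, "t2": {"p2"}}
OUTPUTS = {"t1": {"p1"}, "u": {"p2"}, "t2": {"p3"}}
MARKING = {"p0": 1, "p1": 2, "p2": 1, "p3": 0}
```

Applying the function to `u` leaves the chain p0, t1, p1+p2, t2, p3 with three tokens in the fused place. A fault transition is rejected, which matches the requirement that fault information must be preserved.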
3.2.2. Fusion of Transitions in Parallel (Figure 5(b))
If several transitions share the same single input place and the same single output place, and they are all normal and unobservable, then these transitions can be merged into one transition that is normal and unobservable. The initial marking stays the same.
3.2.3. Fusion of Places in Parallel (Figure 5(c))
If several places share the same single input transition and the same single output transition, then these places can be merged into one place. The marking of this place is the minimum of the markings of the merged places.
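The fusion of parallel places can be sketched in the same style. The code assumes a hypothetical representation in which each transition is mapped to its sets of input and output places, and it keeps the alphabetically first place of the group as the representative; both choices are illustrative only.

```python
from typing import Dict, Iterable, Set, Tuple

Arcs = Dict[str, Set[str]]  # transition name -> set of place names


def fuse_parallel_places(inputs: Arcs, outputs: Arcs, marking: Dict[str, int],
                         group: Iterable[str]) -> Tuple[Arcs, Arcs, Dict[str, int]]:
    """Merge places that share one producer and one consumer; keep the minimal marking."""
    places = sorted(group)

    def producers(p: str) -> Set[str]:
        return {t for t, ps in outputs.items() if p in ps}

    def consumers(p: str) -> Set[str]:
        return {t for t, ps in inputs.items() if p in ps}

    keep = places[0]
    if len(producers(keep)) != 1 or len(consumers(keep)) != 1:
        raise ValueError("each place needs exactly one input and one output transition")
    if any(producers(p) != producers(keep) or consumers(p) != consumers(keep)
           for p in places[1:]):
        raise ValueError("the places are not in parallel")

    dropped = set(places[1:])
    new_inputs = {t: ps - dropped for t, ps in inputs.items()}
    new_outputs = {t: ps - dropped for t, ps in outputs.items()}
    new_marking = {p: m for p, m in marking.items() if p not in dropped}
    new_marking[keep] = min(marking[p] for p in places)
    return new_inputs, new_outputs, new_marking


# Hypothetical net: t1 fills pa and pb, and t2 empties both of them.
INPUTS = {"t1": {"p0"}, "t2": {"pa", "pb"}}
OUTPUTS = {"t1": {"pa", "pb"}, "t2": {"p3"}}
MARKING = {"p0": 1, "pa": 2, "pb": 1, "p3": 0}
```

Taking the minimum is consistent with the firing rule: the common output transition can fire only as often as the least marked of the parallel places allows.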
Example 5. As shown in Figure 3, and are a fault transition that contains fault information. By using reduction rule i), places and are merged and transition can be suppressed. Thus, transition cannot be reduced because it contains fault information. The reduced model is shown in Figure 6.

Theorem 1. With the use of the above reduction rules, the diagnosability of the reduced Petri net is consistent with that of the initial version.
Proof. If the initial Petri net is not diagnosable, then the two sequences of and exist with the same observation. Furthermore, faults exist in but not in . After the fault occurs, can be arbitrarily long. Let a regular and also unobservable transition be contained in , and this transition can be removed with the use of some previous reduction rules. The observation of this reduced sequence is retained even removing the transition . Therefore, the system is still not diagnosable.
If the initial Petri net is diagnosable, then no two sequences of and exist with the same observation. Furthermore, faults exist in but not in . After the fault occurs, can be arbitrarily long. Let a regular and also unobservable transition be contained in , and this transition can be removed with the use of some previous reduction rules. The observation of this reduced sequence is retained (i.e., not observable). Therefore, the system is still diagnosable.
3.3. Level Functions
Definition 13. Given two transitions and in a Petri net system, if , transition is called the up-transition of , and transition is called the down-transition of .
Definition 14. Given a path , if its end node is place p, this path is called the up-observable path of p, denoted as .
Definition 15. Given a path , if its beginning node is p, this path is called the down-observable path of p, denoted as .
With the reduction rules, several unobservable transitions without fault information may be reduced. However, some unobservable transitions cannot be reduced. In reduction rule (i), if transition contains fault information, then places and cannot be merged. Thus reduction rule (i) needs to be modified. If is the output or input of a fault transition, then the new marking in the reduced model is and . If is the output or input of the fault transition, then the new marking in the reduced model is and .
Definition 16. Given a fault transition and a place p, is called the up-transition set of place p if all the transitions of satisfy .
Definition 17. Given a fault transition and a place p, is called the down-transition set of place p if all the transitions of satisfy .
Definition 18. Given a , is called the up-level function of and for all satisfiesAnd is the up-transition of t.
Definition 19. Given a , is called the up-level function of and for all satisfiesAnd is the up-transition of t.
Definition 20. Given a , is called the down-level function of and for all satisfiesAnd is the down-transition of t.
Definition 21. Given a , is called the down-level function of and for all satisfiesAnd is the down-transition of t.
Based on Definitions 13 to 21, the transitions in (or ) can therefore be classified.
3.4. Maximal Number of Flow-In and Minimal Number of Flow-Out
Definition 22 (see [19]). Given an observed sequence in a Petri net, set is called a consistent marking set of .
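For small nets, the consistent marking set can be obtained by brute force, which is useful as a reference when checking the structural calculations that follow. The sketch below is not the method of this paper, which is designed to avoid such an enumeration. It assumes matrices indexed as `matrix[place][transition]`, integer transition indices, and a net in which only finitely many markings are reachable by unobservable firings; all names are hypothetical.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Marking = Tuple[int, ...]
Matrix = List[List[int]]  # indexed as matrix[place][transition]


def consistent_markings(m0: Sequence[int], pre: Matrix, post: Matrix,
                        unobservable: Iterable[int],
                        observed: Sequence[int]) -> Set[Marking]:
    """Markings reachable by firing sequences whose observable part equals observed."""
    silent = list(unobservable)

    def enabled(m: Marking, t: int) -> bool:
        return all(tokens >= row[t] for tokens, row in zip(m, pre))

    def fire(m: Marking, t: int) -> Marking:
        return tuple(tokens + post_row[t] - pre_row[t]
                     for tokens, pre_row, post_row in zip(m, pre, post))

    def closure(markings: Set[Marking]) -> Set[Marking]:
        # Add every marking reachable through unobservable transitions only.
        seen = set(markings)
        stack = list(markings)
        while stack:
            m = stack.pop()
            for t in silent:
                if enabled(m, t):
                    nxt = fire(m, t)
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
        return seen

    current = closure({tuple(m0)})
    for t in observed:
        stepped = {fire(m, t) for m in current if enabled(m, t)}
        current = closure(stepped)
    return current


# Hypothetical net: observable transition 0 moves p0 -> p1, unobservable transition 1 moves p1 -> p2.
PRE = [[1, 0], [0, 1], [0, 0]]
POST = [[0, 0], [1, 0], [0, 1]]
M0 = (1, 0, 0)
```

After observing transition 0, both `(0, 1, 0)` and `(0, 0, 1)` are consistent, because the unobservable transition may or may not have fired. The size of this set can grow quickly with the net, which is the cost that the flow-in and flow-out calculations are meant to avoid.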
Definition 23. Given an observed sequence in a Petri net, if , the maximal number of flow-in of place p with is : = max, and .
Consider a place p and an observed sequence , and assume the maximal capacity of p is k. The maximal number of flow-in of p must satisfy .
Algorithm 1 is used for calculating the maximal number of flow-in of place (or ).
Algorithm 1 is proposed on the basis of the inherent structure information of the given partially observed Petri net. Subsequently, we illustrate the algorithm with the input place of fault transition . We also apply the reduction rules and calculate all the up-level functions in . The tokens flow into and the observable transitions in are fired. The level function and marking are updated. After the transitions with the new level function are fired, new transitions are once again fired. The level function and marking will stop to update only after the tokens move into . At this phase, the variation of is the maximal number of flow-in of .
Assume that the number of inputs of is . Each has nodes, where . The length of the observed sequence is . The complexity of calculating the maximal number of flow-in of place is .
Corollary 1. In , if contains an or-split structure, then the maximal number of flow-in of (or ) cannot be calculated using Algorithm 1.
Proof. The or-split structure comprises a single place p and multiple outputs. When the number of tokens in place p is known, we cannot know with certainty the exact times in which each output is fired, which means that the number of tokens that flow into other places cannot be calculated. Furthermore, because of the uncertain number of tokens in , the maximal number of flow-in of cannot be calculated.
Definition 24. Given an observed sequence in a Petri net, if , the minimal number of flow-out of place p with is : = min, .
Algorithm 2 is used for calculating the minimal number of flow-out of place (or ).
Algorithm 2 is proposed on the basis of the inherent structure information of the given partially observed Petri net. Subsequently, we illustrate the algorithm with the input place of fault transition . With the use of reduction rules, the down-level functions in should be calculated. For any observable transition t belonging to an observed sequence , if its firing time is less than the number of tokens in its inputs, then should be calculated. If the firing time is more than the number of tokens in the inputs, then the unobservable transitions with down-level function need to be fired to supplement the tokens in the inputs of transition t. By parity of reasoning, the unobservable transitions with highest down-level function, is fired to supplement the tokens in the inputs of t. After the observed sequence is fired, the number of supplements is the minimal number of flow-out of place .
Assume that the number of outputs of is . Each has nodes, where . The length of the observed sequence is . The complexity of calculating the minimal number of flow-out of place is .
Corollary 2. In , if contains an or-joint structure, then the minimal number of flow-out of (or ) cannot be calculated using Algorithm 2.
Proof. The or-joint structure comprises a single place p and multiple inputs. When the number of tokens in place p is known, we cannot know with certainty the exact times in which each input is fired, which means that the number of tokens that flow out of other places cannot be calculated. Furthermore, because of the uncertain number of tokens in , the minimal number of the flow-out cannot be calculated.
3.5. Online Fault Diagnosis
In this part, we address the problem of specifying a diagnoser that monitors the observable transitions and classifies the system behavior as normal, ambiguous, or faulty.
Definition 25. Given an observed sequence in a Petri net N, : = max is called the maximal retention number of place p with .
Corollary 3. Given a fault transition and an observed sequence in a , , is the initial marking of the reduced model.
Proof. In , the initial number of tokens in place is when the reduction rules are applied. After all of the unobservable transitions in are fired, the maximal number of flow-in is and the new marking is M. Then, all the unobservable transitions in are fired, and the new marking is and the minimal number of flow-out is . The maximal retention number of is
Definition 26. A diagnoser is a function . For each observed sequence , the following sets hold:(i)For , the behavior of the system is normal during the observed sequence (ii)For , the behavior is ambiguous(iii)For , the behavior of the system is faulty at the observed sequence
Theorem 2. Given a fault transition and an observed sequence in a Petri net, if there exists a and a , and no or-split structure exists in and , and no or-joint structure exists in and , then the following are satisfied:(i)For existing , , which leads to (ii)For all , , , which leads to (iii)For all , , which leads to
Proof. For a fault transition , when the and the exist, the maximal number of flow-in and the minimal number of flow-out of places and may be calculated. Then, according to Corollary 3, the corresponding maximal retention numbers are and . For all places , means that the increment of tokens in equals the initial number of tokens in , and no tokens are left in with the new marking. The fault transition is therefore not enabled and faults cannot occur. Meanwhile, for all places , means that the number of flow-out is more than the sum of flow-in and the initial number, but this condition is unreasonable for a Petri net system. Therefore, a fault transition needs to occur to supplement the missing number of tokens in place . If and , then the fault transition is enabled. Although we cannot decide whether the transition is fired or not, faults may still occur.
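The decision logic described in this proof can be sketched as follows. The formulas themselves are not legible in the text, so the sketch rests on assumptions inferred from the wording of the proof: the retention number of a place is taken as its initial number of tokens plus the maximal flow-in minus the minimal flow-out, an input place with no retained token disables the fault transition, and a negative retention number in the output places means that the fault transition must have supplied the missing tokens. The function names and the labels `"normal"`, `"ambiguous"`, and `"faulty"` are hypothetical.

```python
from typing import Sequence


def retention(initial_tokens: int, max_flow_in: int, min_flow_out: int) -> int:
    """Assumed form of the maximal retention number of a place."""
    return initial_tokens + max_flow_in - min_flow_out


def diagnose(input_retention: Sequence[int], output_retention: Sequence[int]) -> str:
    """Classify the behavior with respect to a single fault transition.

    input_retention:  retention numbers of the input places of the fault transition
    output_retention: retention numbers of the output places of the fault transition
    """
    # Tokens are missing in every output place: only the fault can have supplied them.
    if output_retention and all(r < 0 for r in output_retention):
        return "faulty"
    # Some input place retains no token: the fault transition cannot be enabled.
    if any(r <= 0 for r in input_retention):
        return "normal"
    # The fault transition is enabled, but whether it fired cannot be decided.
    return "ambiguous"
```

For example, `diagnose([0, 2], [1])` returns `"normal"`, `diagnose([1], [0])` returns `"ambiguous"`, and `diagnose([1], [-1])` returns `"faulty"`. Testing the faulty case first is a design choice of this sketch; the theorem lists the three cases without an order.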
When the number of states increases, the computational complexity of fault diagnosis becomes high. The computational complexity of calculating the maximal retention number in places and determines the computational complexity of deriving the diagnoser function . Furthermore, the computational complexity of calculating the maximal number of flow-in and the minimal number of flow-out was given above. Thus, the computational complexity of deriving the function is also polynomial, which allows the method to meet real-time requirements.
Example 7. Consider the partially observed Petri net model as described in Figure 1. and , and the fault transitions are . For the fault transition 1, its input and output are and , and the and can be derived, as shown in Figure 7. After using the modified reduction rule (i), the reduced model can be constructed as shown in Figure 8. , and . Given an observed sequence , on the basis of Algorithms 1–3, we can deriveThe fault transition must occur. In a similar manner, we can derive , , , and the fault transition must not occur.


4. Example
An intelligent warehouse center is provided as an example. It is used to sort and consolidate the goods in some supply chain systems. The sorting operation is performed with an automated guided vehicle (AGV) as described in Figure 9, which is similar to the conveyor network studied in [4].

The intelligent warehouse center can be described in simplified form as follows: First, the AGV will cross seven areas (for example, to ). In these areas, some different operations are designated before the AGV returns to . Then, two types of parts A and B will be transferred by the AGV from a loading station (for example, ) to the buffers () and (), respectively. After that, there is a bar code reader named R. It would distinguish between these parts and then transfers the information to the supervisor. Through activating these two switches of and , the supervisor can direct the parts toward the corresponding buffer. Once the switch is activated, a part would move toward (i.e., a part of type 1 can be detected). Similarly, once the switch is activated, a part would move toward (i.e., a part of type 2 can be detected). Finally, the AGV would move to the other areas when the process of driving a part to (or ) is finished. In the end, it would return to . When the sensors , , and are contacted, it indicates that the AGV has entered the areas , , and . In this model of the intelligent warehouse center, two different classes of faults are considered. The first one is when a part of type 1 exits the system, it exits through the buffer but not the buffer , and then, we say a fault () occurs. The second one is when the AGV starts a cycle but without any part loading, and then, we say a fault of class () occurs.
The Petri net system with the initial marking is used to model the system. The normal behaviors are represented by transitions to as shown in Figure 10 (full and black lines), whereas the faulty behaviors are denoted by the unobservable transitions and , as shown in Figure 10 (dashed and red lines). Unobservable transitions are represented by empty sticks. The significance of transitions and places is listed in Table 1.

With the use of reduction rule (i), the diagnosable subnet of this model can be derived as shown in Figure 11. The input of fault transition is place , and the output is . The input of fault transition is place , and the output is .

Consider an observed sequence . In Figure 7, the initial markings are , , , and . Transition is fired once, and , , ; , , and . , and thus, the fault must occur. Moreover, , , ; , , and . , and thus, the fault must not occur.
Consider an observed sequence . We can derive , , ; , , and . , and thus, the fault must not occur. In addition, , , ; , , and . , and the fault may occur.
Consider an observed sequence . We can derive , , ; , , and . , and thus, the fault must not occur. In addition, , , ; , , and . , and the fault must occur.
We simulated further sequences with Algorithm 3; part of the results is listed in Tables 2 and 3.
The structure of the Petri net is much simpler than that of the method using the reachability graph, as shown in Figure 11. The number of state markings (polynomial) is also decreased. Consider another method based on basis markings: if the observable string is , then the possible occurring sequences are , , , and , the results of which are the same as those of our method. In the basis marking method, the larger the structure of the Petri net is, the more occurring sequences there are, which leads to a more complex computation. This limitation is avoided in the present work because not all of the possible occurring sequences are collected; i.e., we only focus on the inputs and outputs of the fault transitions and calculate the number of tokens that flow in or out. Consequently, we can diagnose the faulty behavior in polynomial time.
5. Conclusions and Future Work
This work addresses the problem of fault diagnosis of DESs and proposes an online diagnoser based on partially observed Petri nets. The online computation is formulated on the net structure and applied together with reduction rules. The method takes an observed sequence of system events as the basis for deciding online whether the system behavior is normal, faulty, or uncertain. To this end, the maximal number of flow-in and the minimal number of flow-out are introduced. By calculating these maximal and minimal numbers for some reduced subnets, the maximal retention number of places and can be obtained, which means that the number of tokens in places and can be determined after an observed sequence is fired. Subsequently, it can be decided whether the fault transition is enabled. The entire process is formulated using the appropriate reduced subnets with fault information, which reduces the computational effort of solving the fault diagnosis problem at the price of a small memory increase and thereby meets the real-time requirement.
Several directions are considered in our future work. First, we want to explore additional reduction rules for the fault diagnosis problem of DESs. In such a case, the structure of the Petri net can be further elaborated. In addition, we plan to explain in detail the relationship between transitions and places given the above special structure. Finally, we will likely extend the proposed method to labeled Petri nets in the nondeterminism context.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.