Abstract
The actor nodes in a wireless sensor and actor network (WSAN) are responsible for receiving the sensed data, processing it, and collaborating with each other. In most scenarios, maintaining the connectivity of the interactor network is necessary to plan the optimal coordinated response. However, actors are vulnerable to damage due to their limited energy and the harsh environment. At worst, such a failure can split the interactor network, which degrades network performance. To restore network connectivity, existing methods replace the failed node with a redundant node selected from the network. Multiple nodes may be involved in the movement from the redundant node to the failed node, thus forming a repair path. However, the repair paths generated by such methods are often not optimal. In this paper, we use a gradient generation and diffusion mechanism to restore the connectivity of the interactor network and propose a gradient-based distributed connectivity recovery (GDCR) algorithm. GDCR selects an optimal repair path from the global network based on the generated gradient distribution under fully distributed and localized conditions. GDCR responds to failures in a timely manner and minimizes the recovery range and movement overhead. Simulation results verify the performance of the proposed algorithm.
1. Introduction
A wireless sensor network (WSN) is a self-organizing network system [1–3] composed of numerous cheap, stationary sensors deployed in the task area. The sensor nodes sense and collect data about events in the target area and transmit it to the remote base station through multiple hops [4]. However, due to the limited resources and immobility of sensors, a WSN can only passively perceive the environment. Furthermore, more and more applications require network systems to interact with the surrounding environment. Therefore, the wireless sensor and actor network (WSAN) came into being.
Adding a few actor nodes with sufficient energy, high computing and communication capabilities, and especially mobility to a WSN gives it good control and execution capabilities [5], allowing it to better interact with the environment (i.e., the WSN becomes a WSAN). As described above, a WSAN places numerous stationary sensors and fewer mobile actors in a target area. The sensor nodes are responsible for collecting data from the task area and sending it to the nearby actors. Then, the actor nodes analyze and process the data to perform the corresponding action.
Because of this feature, WSAN is usually applied to complex mission scenarios. For example, literature [6] uses a solar-powered wireless sensor network to monitor large, remote, and inaccessible areas. In literature [7], WSAN is used for fire monitoring so that fires can be extinguished before they become uncontrollable. Literature [8] proposes to complete tasks such as traffic monitoring, maritime search and rescue, battlefield reconnaissance and detection, and coordinated target tracking through mobile sensor networks. Literature [9, 10] explores potential applications of WSAN, which include military, environmental, health, domestic, and other commercial fields.
In the complex applications mentioned above, actors are often required to make collaborative decisions. For instance, in maritime search and rescue, unmanned aerial vehicles (UAVs) act as sensor nodes and search the corresponding sea area. When the UAVs find a rescue target, they immediately notify the actor nodes (e.g., ships or rescue teams). The actor nodes need to make a coordinated decision immediately and select the most appropriate actor to execute the rescue operation. This requires that the actor nodes can communicate with each other and maintain a connected network at all times. However, due to limited energy and the harsh environment, actors are easily damaged. At worst, such a failure can split the network between actors, which disrupts their interactions. Therefore, when an actor fails, restoring the connectivity of the interactor network is critical to the effectiveness of WSAN [11–13]. Meanwhile, because of the decentralized and self-organizing nature of WSAN, a centralized recovery algorithm is difficult to implement.
Therefore, the mainstream connectivity recovery methods are distributed processes that rely only on local information. For example, DCR proposed in literature [14] and HCR in literature [15] need only the one-hop neighbor information of actors. Literature [16] proposed the DCRS algorithm, a variant of DCR that requires the two-hop neighbor information of actors. All of these methods preselect backup nodes for the critical nodes in the network. When a critical node fails, its backup node replaces the faulty node as the repair node. The backup node can itself be a critical node, so selecting backup nodes is a recursive process. However, these algorithms can only take neighbor nodes as the set of candidate backups, so the selected backup is often not globally optimal. As a result, the repair path is lengthened, which increases the movement overhead of the recovery process. Given the limited energy supply of actors, this causes excessive energy loss, which may lead to further actor failures. Each new actor failure in turn triggers a new recovery process and more energy consumption, creating a vicious cycle. Therefore, the distance traveled by actors during the recovery process should be minimized to reduce the harmful impact of actor failures.
Based on the above, we propose a gradient-based distributed connectivity recovery (GDCR) algorithm. First, GDCR identifies critical nodes in the network and generates stable gradient distributions in the network through gradient generation and diffusion mechanisms. Then, each critical node designates the optimal backup node according to the generated gradient distribution. Finally, the backup node needs to maintain monitoring of its primary node for possible failure. When a critical node fails, the backup node moves to the location of its primary node to restore network connectivity. GDCR can select the optimal repair path based on the generated gradient distribution, to restore network connectivity with the lowest movement overhead. In general, GDCR is a distributed algorithm, and each node in the network only needs to maintain one-hop neighbor’s information.
The main contributions of this paper are as follows:
(i) To our knowledge, this work is the first to apply gradient generation and diffusion mechanisms to network connectivity recovery. We also design a gradient diffusion formula and a gradient-based backup node selection strategy.
(ii) GDCR handles the cyclic repair problem that arises when the network topology is a ring. Furthermore, the GDCR algorithm is also suitable for mobile robot networks and mobile sensor networks.
(iii) Simulation results show that GDCR is superior to existing algorithms in total moving distance, load balancing of node energy, average moving distance, and network coverage loss.
The organizational structure of this article is as follows: Section 2 describes the system model under consideration along with the problem statement. Related work is described in Section 3. The proposed GDCR algorithm is described and analyzed in detail in Section 4. The theoretical analysis is illustrated in Section 5. Section 6 evaluates the performance of GDCR by simulation. Finally, the conclusion and future work are given in Section 7.
2. System Model and Problem Formulation
2.1. System Model
In this paper, we are primarily concerned with the interactor network and assume that it is initially connected. In addition, for the convenience of analysis, it is assumed that all actors have the same communication range, where the communication range of an actor refers to the maximum transmission range of its wireless signals. The topology of the interactor network can be represented by an undirected graph whose vertices are the actors in the network and whose links connect pairs of actors that are within communication range of each other. Each node in the network has a unique identity (ID) to distinguish it from other nodes. Besides, it is assumed that each node can obtain its position by some positioning technique. Since GDCR is an active selective repair algorithm, each node also maintains a critical node identifier, a backup node, and a gradient value. The critical node identifier is a binary flag: it is set if the current node is identified as a critical node and cleared otherwise.
In this paper, each node maintains only an information table of its one-hop neighbors. The table has three columns of information about the node's one-hop neighbors, including their gradient values. This information is generated by collecting the HELLO message packets exchanged between nodes and is maintained throughout network operation to reflect changes in the network topology. All information maintained by a node is shown in Table 1.
2.2. Problem Formulation
This paper focuses on restoring network connectivity after the failure of a single actor node. Recovery is achieved mainly by relocating other actors in the network.
The failure of a node does not necessarily disconnect the network, depending on the importance of the failed node in the network topology. Considering the network topology shown in Figure 1, node A6 has two neighbors, A3 and A5. When A6 fails, the connection between node A3 and node A5 remains, so the entire network remains connected. However, the failure of A3 results in the network being divided into three disjoint subnetworks {A2, A4}, {A1, A8}, and {A5, A6, A7, A9, A10}.

In fact, only the failure of a cut-vertex can affect network connectivity. However, GDCR does not identify the cut-vertices of the network. Identifying a cut-vertex requires each node to know the topology of the global network, which incurs a very large message overhead. The critical node considered in this paper is actually a one-hop cut-vertex, i.e., a node whose failure destroys the connectivity of the network formed by its one-hop neighbors. Figure 2 shows the difference between a cut-vertex and a critical node (one-hop cut-vertex). Node A in Figure 2(a) is a cut-vertex of the global network, and its failure will lead to the loss of connectivity of the entire network. Meanwhile, node F in Figure 2(b) is a critical node. The failure of F will only make the network {A, B, C, D} formed by its one-hop neighbors lose connectivity.

Figure 2: Difference between (a) a cut-vertex and (b) a critical node (one-hop cut-vertex).
It should be noted that a cut-vertex must be a one-hop cut-vertex, whereas a one-hop cut-vertex is not necessarily a cut-vertex, as shown by node F in Figure 2(b). Determining cut-vertices is costly in terms of message overhead. In WSAN, because of the limited power supply, we must rely on as little local information as possible to identify critical nodes. Literature [17, 18] trades off the communication overhead of collecting limited local information against the practicality of the derived conclusions. Based on this trade-off, we declare one-hop cut-vertices to be the critical nodes. Meanwhile, GDCR selects the optimal backup node for each critical node in advance according to the generated gradient distribution. The backup node detects the failure of its primary node and performs node relocation to restore network connectivity. GDCR handles the failure of a single node in a network and recovers from it with minimal movement overhead and minimal recovery range.
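The distinction between a cut-vertex and a one-hop cut-vertex can be illustrated with a short sketch. The adjacency dictionaries and function names below are illustrative assumptions and are not part of GDCR. The four-node ring is a hypothetical topology, chosen because each of its nodes is a one-hop cut-vertex while none is a cut-vertex.

```python
def connected(nodes, adj):
    """Return True if the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def is_cut_vertex(v, adj):
    """Global test: does removing v disconnect the rest of the network?"""
    return not connected(set(adj) - {v}, adj)


def is_one_hop_cut_vertex(v, adj):
    """Local test: are v's one-hop neighbors disconnected among themselves?"""
    return not connected(adj[v], adj)


# Hypothetical topologies (not taken from the paper's figures).
ring = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

# (is_cut_vertex, is_one_hop_cut_vertex) for one node of each topology.
ring_result = (is_cut_vertex("A", ring), is_one_hop_cut_vertex("A", ring))
chain_result = (is_cut_vertex("B", chain), is_one_hop_cut_vertex("B", chain))
```

The global test needs the full adjacency structure, whereas the local test only inspects the links among the neighbors of the node, which is why the latter can be evaluated from one-hop information alone.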
3. Related Works
In recent years, network connectivity recovery has attracted extensive attention. Many studies [19–22] focus on connectivity recovery of single node failure. All existing algorithms can be divided into two categories: nonidentifying critical nodes and identifying critical nodes.
Algorithms that do not identify critical nodes do not evaluate the importance of nodes in advance. A recovery process is performed for each node failure, even if the network connectivity is not lost. The typical algorithm in this class is recovery through inward motion (RIM) [23]. When a fault occurs, RIM requires that all one-hop neighbors of the failed node F move towards node F until they are reachable to each other. In addition, any link disconnection between two nodes caused by this process will repeat the process, so the whole recovery process is a cascade movement process. As a result, RIM incurs significant movement overhead during the recovery process.
This kind of algorithm includes redundant node movement algorithm [24] and MCDS algorithm [25]. The redundant node movement algorithm calls the redundant nodes in the network to replace the faulty node in order to repair the network. MCDS selects an appropriate node to replace the faulty node according to the minimum connected dominating set of the network. But these two algorithms belong to the centralized control algorithm and require each node to know the topology information of the entire network. The topology information of the entire network can be obtained by flooding the entire network, but this will cause a lot of message transmission costs. In addition, when a node fails, algorithms that do not identify critical nodes cannot select an efficient solution according to the actual situation of the network. Moreover, this kind of algorithm has a certain delay in fault detection, which cannot meet the high real-time requirements of some scenes.
Different from the algorithm proposed above, the algorithm for identifying critical nodes first divides all nodes into critical nodes and noncritical nodes. This type of algorithm only deals with the failure of critical nodes and ignores the failure of noncritical nodes, because noncritical nodes represent redundant nodes that have no impact on network connectivity, such as leaf nodes. Due to different methods of identifying critical nodes, the scope of critical nodes is different. But the cut-vertex that affects network connectivity is included in the scope of critical nodes.
Some connectivity recovery algorithms handle only cut-vertex failures, such as the DARA algorithm in literature [26] and the PADRA algorithm in literature [27]. DARA selects the best candidate node from the one-hop neighbors of the failed node for replacement. The selection criteria are based on the degree of the node and the distance to the failed node. However, DARA provides no method for identifying cut-vertices. The PADRA algorithm uses a connected dominating set to divide nodes into dominant nodes and dominated nodes and searches for a dominated node to replace the faulty dominant node. PADRA assumes that the connected dominating set of the network is known. However, determining the minimum connected dominating set has been proved to be an NP-hard problem, so in practice it is solved by heuristic algorithms.
In literature [14], a distributed partitioning detection and connectivity recovery (DCR) algorithm was proposed. DCR does not identify the cut-vertex, but identifies the critical node through the one-hop neighbor’s information. Then, each critical node selects the appropriate backup node from its one-hop neighbors. The selection policy is to select a noncritical node closest to the faulty node. If no noncritical node exists in the neighbors, select a critical node with the highest degree. Finally, move the backup node to the location of the failed node to restore network connectivity. DCR chooses one-hop neighbor nodes as candidate backup sets, so the selected backup is often not optimal.
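The DCR selection policy described above can be sketched as follows, as a baseline for comparison. This is a minimal illustration and not the reference implementation of [14]; the function name, the data layout (`positions`, `nbrs`, `critical`), and the first-found tie-breaking are assumptions made for the example.

```python
import math


def dcr_select_backup(node, positions, nbrs, critical):
    """Pick a backup among the one-hop neighbors of `node`, DCR style.

    positions: node id -> (x, y)
    nbrs:      node id -> list of one-hop neighbor ids
    critical:  set of ids identified as critical nodes
    """
    candidates = nbrs[node]
    if not candidates:
        return None
    noncritical = [j for j in candidates if j not in critical]
    if noncritical:
        # Prefer the noncritical neighbor closest to the node.
        return min(noncritical,
                   key=lambda j: math.dist(positions[node], positions[j]))
    # Otherwise fall back to the critical neighbor with the highest degree.
    return max(candidates, key=lambda j: len(nbrs[j]))


# Hypothetical neighborhood of a critical node F.
positions = {"F": (0, 0), "A": (1, 0), "B": (0, 3), "C": (-2, 0)}
nbrs = {"F": ["A", "B", "C"], "A": ["F"], "B": ["F", "X", "Y"], "C": ["F", "X"]}

closest_noncritical = dcr_select_backup("F", positions, nbrs, critical={"B"})
highest_degree = dcr_select_backup("F", positions, nbrs, critical={"A", "B", "C"})
```

Because the candidate set is restricted to one-hop neighbors and the choice ignores what lies beyond them, the selected backup can be far from the nearest noncritical node in the network.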
Literature [15] proposed HCR algorithm, which is a variant of DCR. HCR chose to move the backup node to the optimal location rather than the location of the failed node, which reduced the movement overhead somewhat, but broke the original network topology. In addition, CCRA [28] and DCRS [16] proposed to use two-hop neighbors to identify critical nodes, which improved the accuracy of identifying critical nodes. But it increases the cost of message transmission, which increases the energy consumption of nodes and is not conducive to prolonging the lifetime of the network. Similar to DCR, these algorithms select backups for critical nodes and then relocate the backup node to restore network connectivity. As a result, they do not choose the optimal backup.
We introduce the gradient generation and diffusion mechanism and improve the backup selection criteria. It makes it possible to choose the global optimal backup in both distributed and local situations. In addition, compared with the existing literature, this paper not only focuses on the movement cost and message cost but also considers the energy load balancing of network nodes.
4. Gradient-Based Distributed Connectivity Restoration Algorithm
The GDCR algorithm is designed to solve the problem of single node failure in the interactor network in a distributed and localized manner, with low latency and low movement cost. This section describes the GDCR algorithm in detail.
4.1. Identifying Critical Nodes
The GDCR algorithm first divides all nodes in the network into critical nodes and noncritical nodes. This process runs in a distributed manner on each node. The rules for identifying critical and noncritical nodes are as follows:
(i) If a node is a leaf node, it is identified as a noncritical node.
(ii) If a node is not a leaf node, further identification is required. Specifically, based on the communication range and the location information of the node's one-hop neighbors, a small network topology that contains only those neighbors is modeled. This topology can be represented simply by an adjacency matrix. Finally, a depth-first search (DFS) is performed on this adjacency matrix. If all nodes can be visited, all neighbor nodes can reach each other without the node, i.e., the node is a noncritical node. Otherwise, the node is considered a critical node.
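The identification rules above can be sketched as a local test that uses only the positions of a node's one-hop neighbors. This is a minimal sketch under stated assumptions: the function name and the dictionary layout are hypothetical, positions are 2D coordinates, and two neighbors are taken to be linked when their Euclidean distance does not exceed the communication range.

```python
import math


def is_critical(neighbor_positions, comm_range):
    """Return True if the node owning this neighbor table is a critical node.

    neighbor_positions: one-hop neighbor id -> (x, y)
    comm_range:         common communication range of all actors
    """
    ids = list(neighbor_positions)
    if len(ids) <= 1:
        return False  # leaf node (or isolated node): noncritical

    # Depth-first search over the topology formed by the neighbors only.
    seen = {ids[0]}
    stack = [ids[0]]
    while stack:
        u = stack.pop()
        for v in ids:
            if v not in seen and math.dist(
                neighbor_positions[u], neighbor_positions[v]
            ) <= comm_range:
                seen.add(v)
                stack.append(v)

    # Some neighbor is unreachable without the node: the node is critical.
    return len(seen) < len(ids)


# Hypothetical neighbor tables with a communication range of 1.0.
middle_of_chain = is_critical({"A": (0.0, 0.0), "C": (2.0, 0.0)}, 1.0)
leaf = is_critical({"A": (0.0, 0.0)}, 1.0)
inside_triangle = is_critical({"A": (0.0, 0.0), "C": (0.5, 0.5)}, 1.0)
```

The distance checks play the role of the adjacency matrix mentioned in rule (ii); building the matrix explicitly would give the same result.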
For instance, applying critical node detection to the network in Figure 1 yields Figure 3. Nodes A2 and A8 are noncritical because they are leaf nodes. Nodes A6, A7, and A9 are also noncritical: even without A6, its neighbors A3 and A5 remain connected, and A7 and A9 satisfy the same condition. However, the absence of node A5 will cause the network to be divided into two subnetworks {A7, A9, A10} and {A1, A2, A3, A4, A6, A8}. Therefore, node A5 is a critical node. Similarly, nodes A1, A3, A4, and A10 are critical nodes.

4.2. Gradient-Based Recursive Self-Selecting Backup Algorithm
After the critical nodes have been identified, the next step is to select a backup node for each critical node. To reduce the movement overhead and risk during recovery, select backup nodes as close to noncritical nodes as possible. Therefore, this paper designs a recursive self-selecting backup algorithm based on gradient. The algorithm steps are divided into two parts: gradient generation and diffusion and optimal backup node selection.
4.2.1. Gradient Generation and Diffusion Mechanism
The noncritical nodes identified above are defined as gradient source nodes or seed nodes, and the critical nodes are defined as nongradient source nodes. In the rest of this paper, we denote the set of noncritical nodes in the network by $S$.
For each noncritical node , its gradient value is initialized to 0, i.e., . In addition, the gradient of each critical node is initialized to infinity. Then, the gradient value is sent to all neighbor nodes through HELLO messages. Each local information interaction is an iterative process of gradient values. In each interaction of local information, the critical node will use the following gradient diffusion (1) to update its own gradient value. where represents the self-increment of the node gradient in each iteration and represents the gradient increment between neighboring nodes during the gradient diffusion process, which needs to satisfy .
The role of is reflected in such a situation: when the connectivity restoration process is completed, a previous seed node (i.e., a noncritical node) may become a critical node, and its gradient value should be updated. However, since the gradient value of this node is always 0, it cannot be updated by the gradient diffusion (1). Therefore, the gradient self-increment is used to solve this problem.
In this paper, however, we stipulate that after each recovery process, each node reidentifies its identity (i.e., critical or noncritical). The gradient value of each node is then reinitialized according to the result of this reidentification, and the GDCR algorithm continues to be applied, which forms a closed loop. Therefore, in this paper, we set the gradient self-increment to always be 0.
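For GDCR, the design of the diffusion process focuses on the gradient increment between neighboring nodes. For this reason, we redefine the meaning of the gradient so that the gradient value represents the distance to the nearest seed node (noncritical node). Therefore, we take the gradient increment as the distance between two adjacent nodes. The final gradient diffusion formula (2) is as follows:

$$g_i = \min_{j \in N(i)} \left( g_j + d_{ij} \right) \quad (2)$$

where $g_i$ is the gradient value of critical node $i$, $N(i)$ is the set of its one-hop neighbors, and $d_{ij}$ is the distance between nodes $i$ and $j$.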
The gradient is first generated by the gradient source nodes and then recursively diffused into the entire network through HELLO messages. When a nongradient source node (critical node) receives the message, it executes formula (2) to update its gradient value. The gradient diffusion process ends when all critical nodes in the network have reached stable gradient values. Figure 4 shows a connected network composed of 10 actor nodes whose gradient values have stabilized. The number at the lower right corner of each node in the figure is its gradient value.
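The diffusion process can be sketched as repeated synchronous rounds in which every critical node recomputes its gradient from the values last advertised by its neighbors. The sketch assumes that the update takes the form "minimum over one-hop neighbors of the neighbor's gradient plus the distance to that neighbor", which is how the text describes formula (2), and that the self-increment is 0. The function name, the data layout, and the five-node chain are hypothetical.

```python
import math


def diffuse_gradients(positions, edges, critical):
    """Run gradient diffusion until no critical node changes its value.

    positions: node id -> (x, y)
    edges:     iterable of (u, v) links of the interactor network
    critical:  set of critical node ids (all other nodes are seeds)
    """
    nbrs = {v: set() for v in positions}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    # Seeds start at 0, critical nodes at infinity.
    g = {v: (math.inf if v in critical else 0.0) for v in positions}

    changed = True
    while changed:  # one pass = one round of HELLO exchanges
        changed = False
        new_g = dict(g)
        for i in critical:
            best = min(
                (g[j] + math.dist(positions[i], positions[j]) for j in nbrs[i]),
                default=math.inf,
            )
            if best < new_g[i]:
                new_g[i] = best
                changed = True
        g = new_g
    return g


# Hypothetical chain A - B - C - D - E with seeds at both ends.
positions = {"A": (0, 0), "B": (3, 0), "C": (7, 0), "D": (12, 0), "E": (20, 0)}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
gradients = diffuse_gradients(positions, edges, critical={"B", "C", "D"})
```

In this example node C settles at 7 (via B and A) rather than 13 (via D and E), so its stable gradient is the path length to the nearest seed. If every node is critical, as in a ring, all gradients remain infinite and the loop stops after the first round.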

4.2.2. Optimal Backup Node Selection
For each critical node, we need to specify an optimal backup node. The advantage of preselecting the backup is that it can react immediately to the failure of the critical node, thereby reducing the time of the repair process. For this reason, whenever a critical node updates its gradient value, it also selects a backup node.
Based on the gradient distribution, each critical node knows the distance to the nearest noncritical node, which helps us choose the optimal backup node, thus greatly optimizing the repair path. Specifically, we designed the following gradient-based optimal backup selection rules:
Rule 1: the neighbor node that generates the gradient value of the current node, i.e., the neighbor that attains the minimum when the current node updates its gradient, is selected as the backup node
Rule 2: if there are multiple neighbors that satisfy rule 1, the neighbor node with the largest degree is selected as the backup node
Rule 3: if there are still multiple neighbors that meet the above rules, one of them is randomly selected as the backup node
Rule 1 can maximize the descent gradient, thereby ensuring that the noncritical nodes can be reached with the smallest moving distance. Rule 2 is inspired by the DCR algorithm [14] and aims to reduce the loss of network coverage after connectivity repair. Finally, rule 3 guarantees the uniqueness of the selected backup node. The above rules select an optimal backup node for each critical node after the gradient distribution is stable.
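The three rules can be sketched as a small selection function. This is an illustration under stated assumptions: the function and parameter names are hypothetical, the neighbor "that generates the gradient value" is taken to be the one minimizing its own gradient plus its distance to the current node, and each node is assumed to know the degree of its neighbors, which the text does not state explicitly.

```python
import math
import random


def select_backup(node, positions, nbrs, g, rng=None):
    """Select the backup of a critical node according to Rules 1 to 3.

    positions: node id -> (x, y)
    nbrs:      node id -> list of one-hop neighbor ids
    g:         node id -> stable gradient value
    """
    rng = rng or random
    cost = {j: g[j] + math.dist(positions[node], positions[j])
            for j in nbrs[node]}
    if not cost or math.isinf(min(cost.values())):
        return None  # no seed reachable, e.g., a ring of critical nodes

    # Rule 1: neighbors that generate the gradient value of `node`.
    best = min(cost.values())
    candidates = [j for j in cost if math.isclose(cost[j], best)]

    # Rule 2: among them, keep the neighbors with the largest degree.
    max_degree = max(len(nbrs[j]) for j in candidates)
    candidates = [j for j in candidates if len(nbrs[j]) == max_degree]

    # Rule 3: break any remaining tie at random.
    return rng.choice(candidates)


# Hypothetical chain A - B - C - D - E with stable gradients.
positions = {"A": (0, 0), "B": (3, 0), "C": (7, 0), "D": (12, 0), "E": (20, 0)}
nbrs = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
g = {"A": 0.0, "B": 3.0, "C": 7.0, "D": 8.0, "E": 0.0}

backup_of_c = select_backup("C", positions, nbrs, g)
backup_of_d = select_backup("D", positions, nbrs, g)
```

Following the selected backups from any critical node leads to strictly smaller gradient values, so the chain of backups ends at a seed node.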
When a critical node fails, its neighbor nodes can no longer communicate with each other through it. Therefore, each critical node needs to send a BACKUP message to its selected backup node in advance. The backup node monitors its primary node for possible failures according to the received BACKUP message. In the GDCR algorithm, one node may act as a backup for multiple critical nodes. Figure 5 shows the backups selected by the critical nodes when the gradient distribution is stable, where each arrow points to the primary node.

4.3. Failure Detection and Recovery Process
During the operation of the network, each node will periodically send HEART-BEAT messages to its backup node to indicate that it is in a normal state. The missing HEART-BEAT messages can be used to detect the state of the node. Once a node receives the BACKUP message, the node starts to use the continuously lost HEART-BEAT messages to detect the failure of its primary node. Note that the detected failure must be the failure of the critical node, because we did not prepare a backup for the noncritical node.
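Failure detection based on consecutively lost HEART-BEAT messages can be sketched as a small monitor kept by each backup node. The class name, its methods, and the threshold of three missed periods are assumptions made for illustration; the text does not specify how many lost messages indicate a failure.

```python
class HeartbeatMonitor:
    """Tracks the primaries that have designated this node as their backup."""

    def __init__(self, period, max_missed=3):
        self.period = period          # expected HEART-BEAT interval (seconds)
        self.max_missed = max_missed  # assumed number of tolerated losses
        self.last_seen = {}           # primary id -> time of last message

    def on_backup(self, primary, now):
        """A BACKUP message arrived: start monitoring this primary."""
        self.last_seen[primary] = now

    def on_heartbeat(self, primary, now):
        """A HEART-BEAT arrived from a primary that is being monitored."""
        if primary in self.last_seen:
            self.last_seen[primary] = now

    def failed_primaries(self, now):
        """Primaries silent for more than max_missed heartbeat periods."""
        limit = self.period * self.max_missed
        return [p for p, t in self.last_seen.items() if now - t > limit]


# Hypothetical timeline: A3 designates this node, sends one beat, then stops.
monitor = HeartbeatMonitor(period=1.0, max_missed=3)
monitor.on_backup("A3", now=0.0)
monitor.on_heartbeat("A3", now=1.0)
still_alive = monitor.failed_primaries(now=3.5)
detected = monitor.failed_primaries(now=4.5)
```

Since only nodes that sent a BACKUP message are tracked, every failure reported by the monitor concerns a critical node, in line with the note above.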
After detecting the failure of the primary node, the backup node will immediately start the recovery process. The recovery process may be a simple process or a complex process, depending on whether the backup node selected by the failed node is a critical node. Therefore, we divide the recovery process into the following two categories.
4.3.1. Simple Recovery
When a failure occurs, if the backup node is a noncritical node, the recovery range is limited. The backup node moves directly to the location of the failed node and exchanges local information with its new neighbors. However, this movement may turn the backup node into a new critical node. Moreover, because the backup node may act as a backup for multiple critical nodes, its movement also affects those critical nodes. GDCR resolves this problem on its own. Specifically, after the backup node moves to its new location, it reidentifies its own importance based on the location information of its new neighbors. It then triggers the reinitialization of its own gradient value, and after a finite number of gradient diffusions, the gradient distribution becomes stable again. Through this process, the problems mentioned above are solved, and in the end, the backup selected by each critical node is still the optimal node in the new network.
Figure 6 provides an example of a simple recovery process. As shown in Figure 6(a), the backup node A6 has detected that its primary node A3 has failed. Figure 6(b) shows that node A6 directly replaces its primary node A3. At this time, the neighbors of A6 become A4, A5, and A1, and A6 identifies that it is a critical node according to the location information of its new neighbors. Figure 6(c) shows that after several times of gradient diffusion to stabilize the gradients of all nodes, the backup nodes of critical nodes A5 and A6 are reselected. Node A6 selects node A4 as the backup node because node A6 passes through node A4 to reach the nearest noncritical node A2. Similarly, node A5 selects node A10 as the backup node in order to reach the nearest noncritical node A7. It is worth mentioning that the total movement cost of this recovery process is the gradient value of failed node A3, 22, which is the lowest movement cost that can be achieved.

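Figure 6: Example of a simple recovery process: (a) backup node A6 detects the failure of its primary node A3; (b) A6 replaces A3; (c) the gradients stabilize again and the backups of critical nodes A5 and A6 are reselected.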
4.3.2. Cascaded Relocation
As mentioned earlier, in some special cases, the recovery process is a cascaded motion process involving more than two nodes at a time. Figure 7(a) shows that the critical backup node A4 finds that its primary node A6 has failed. Figures 7(b) and 7(c) show the cascaded relocation process. As shown in Figure 7(d), after the recovery process is over, a stable gradient distribution is generated again, and some critical nodes update their backup nodes. The repair path of this restoration process is A2 → A4 → A6, and the total movement cost is 32, the gradient value of the failed node A6. It can also be proved that this total movement cost is the minimum, i.e., the repair path is globally optimal.

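Figure 7: Example of cascaded relocation: (a) the critical backup node A4 detects the failure of its primary node A6; (b), (c) the cascaded relocation process; (d) the stable gradient distribution regenerated after recovery.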
It is worth noting that GDCR can also handle the situation where all nodes form a ring, in which there are no noncritical nodes in the network. GDCR ignores a node failure that occurs in this case: because all nodes in the network are critical nodes, their gradient values never drop, and no node is selected as a backup node. Ignoring this failure is in fact correct, because it avoids loop repairs. This is another advantage of GDCR over existing algorithms.
The pseudocode of GDCR is shown in Algorithm 1. First, each node broadcasts a message containing its location and ID. Each node builds a one-hop neighbor information table NT (lines 1-2) according to the received messages. Then, each node uses its NT to determine whether it is a critical node (lines 3-8). A noncritical node is set as a gradient source node, that is, its initial gradient value is set to 0. For a critical node, the initial gradient value is set to infinity. After that, each node continues to broadcast a message containing its own gradient value until the gradient distribution of all nodes is stable (lines 9-13).
Next, an optimal backup node is chosen for each critical node, and the selected backup node is notified to monitor the status of its primary node (lines 14-17). When the backup node A detects that its primary node F has failed or moved, it executes the recovery process (line 18). The restoration process moves the backup node A to the location of its primary node F (line 19).
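The recovery step, including the cascaded case where a moved backup is itself replaced by its own backup, can be sketched as follows. This is a simplified centralized simulation under stated assumptions and not a distributed implementation: the function name and data layout are hypothetical, and the `backup` map is assumed to have been produced by the gradient-based selection, so following it always reaches a noncritical node.

```python
import math


def recover(failed, positions, backup):
    """Simulate the relocation triggered by the failure of `failed`.

    positions: node id -> (x, y), updated in place
    backup:    critical node id -> id of its preselected backup node
    Returns (repair_path, total_moved_distance).
    """
    # Follow the backups from the failed node down to a noncritical node.
    path = [failed]
    while path[-1] in backup and backup[path[-1]] not in path:
        path.append(backup[path[-1]])

    old = {v: positions[v] for v in path}
    total = 0.0
    # Each node on the path moves to the old position of its primary.
    for primary, mover in zip(path, path[1:]):
        total += math.dist(old[mover], old[primary])
        positions[mover] = old[primary]

    del positions[failed]  # the failed node leaves the network
    return path, total


# Hypothetical chain A - B - C - D - E; C fails, B is its backup, A is B's.
positions = {"A": (0, 0), "B": (3, 0), "C": (7, 0), "D": (12, 0), "E": (20, 0)}
backup = {"B": "A", "C": "B", "D": "E"}
path, total = recover("C", positions, backup)
```

In this example the total moved distance equals the stable gradient value that node C would hold (7), which matches the observation in the text that the movement cost of a recovery is the gradient value of the failed node. After the moves, critical nodes would be reidentified and the gradients rediffused; that step is omitted here.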
5. Algorithm Analysis
GDCR can timely respond to the repair and minimize the movement overhead. In this section, we will prove the optimality of GDCR and analyze its performance. We introduce the following theorem.
Theorem 1. The repair path generated by GDCR algorithm is the optimal repair path. In this paper, the repair path with the minimum length is the optimal repair path.
Proof. The proof of this theorem can be divided into two cases.
Case 1. If the failed node is a noncritical node, it has no backup node, i.e., there exists no repair path for it, which satisfies the definition of the optimal repair path.
Case 2. If the failed node is a critical node, then according to formula (2), once the gradient values are stable, the gradient value of the failed node has reached its minimum. The optimal backup node for the failed node is then selected according to the gradient-based optimal backup selection rules. This process is repeated recursively until a noncritical node has been selected as the last backup node. This implies that, in each step of the recursive process, the neighbor with the smallest distance to a seed node is selected as the backup node, i.e., the gradient descent in each step is maximized. So, GDCR guarantees that the selected repair path is the optimal path.
Theorem 2. The time complexity of the gradient-based optimal backup selection is $O(m)$, where $m$ is the number of the failed node's one-hop neighbors.
Proof. Based on the gradient-based optimal backup selection rules, the optimal backup node can be selected by traversing all neighbors of the failed node only once. Therefore, the time complexity of backup selection is $O(m)$.
Theorem 3. The maximum distance that each backup node travels in GDCR is the communication range.
Proof. As mentioned earlier, backup nodes are selected among the one-hop neighbors of a critical node. In the worst case as shown in Figure 8, every backup node needs to move a distance equal to the communication range to reach its primary node. Therefore, the maximum distance a node moves in GDCR is the communication range.
Theorem 4. The max number of backup nodes involved in the recovery process is , where is the number of nodes.
Proof. Consider the worst case as shown in Figure 8. The failure of node will split the network into two blocks of and nodes. There are two repair paths: and . GDCR chooses the shorter repair path, i.e., the block containing nodes. Thus, the maximum number of backup nodes involved in the recovery process is . In the same way, when is an odd number, nodes will involve in recovery process is also .
Theorem 5. The total time of GDCR to restore network connectivity is , where is the number of nodes, is the number of failed node s one-hop neighbors, and is the communication range.
Proof. Firstly, GDCR proactively designates for each critical node a backup node before a node failure. According to Theorem 2, the time complexity of this process is , where is the number of failed node’s one-hop neighbors. Then, when a node fails, the backup node moves to its primary node to restore network connectivity. According to Theorem 4, at most or backup nodes participate in the recovery process. And the maximum time for a backup node to replace its primary is proportional to , as proved in Theorem 3. So, the total time of GDCR to restore connectivity is .
Theorem 6. For an interactor network topology with actor nodes, in order to generate a stable gradient distribution, the message complexity is .
Proof. In the worst case as shown in Figure 8, there are only two seed (noncritical) nodes in the network. The gradient generated by these two seed nodes will be diffused to all the critical nodes through local communications between neighboring nodes. We can find that each seed node needs steps to diffuse the gradient value to the most remote critical node. In this case, the maximum number of neighbors a node has is 2. Thus, in the worst case, the number of messages in the process of gradient diffusion will be . Therefore, the message complexity of the process of gradient diffusion is .
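To make the diffusion argument concrete, the following Python sketch simulates a simple hop-count gradient on a chain in which only the two end nodes are seeds. Several things here are assumptions rather than details taken from GDCR: that the worst-case topology of Figure 8 can be modeled as a chain, that a seed has gradient 0 and every other node takes the smallest neighbor gradient plus one, and that each notification to one neighbor counts as one message. The helper names `chain` and `diffuse_gradient` are hypothetical.

```python
from collections import deque


def chain(n):
    """Adjacency lists of a chain 0 - 1 - ... - (n-1)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def diffuse_gradient(adjacency, seeds):
    """Spread a hop-count gradient from the seeds; return (gradient, messages)."""
    gradient = {node: None for node in adjacency}
    queue = deque()
    for s in seeds:
        gradient[s] = 0
        queue.append(s)
    messages = 0
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            messages += 1  # one message per notified neighbor (assumed accounting)
            candidate = gradient[node] + 1
            if gradient[nb] is None or candidate < gradient[nb]:
                gradient[nb] = candidate  # neighbor adopts the smaller value
                queue.append(nb)          # and later forwards it to its neighbors
    return gradient, messages


grad, msgs = diffuse_gradient(chain(7), seeds=[0, 6])
print(grad)
print(msgs)
```

Under these assumptions every node forwards its value once to at most two neighbors, so the message count grows linearly with the number of nodes in the chain. The exact constant depends on how messages are counted and need not match the accounting used in the proof.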
Theorem 7. The total message complexity of GDCR is , where is the number of nodes in the network.
Proof. The total message complexity of GDCR algorithm can be divided into two parts. One is the message complexity of the gradient diffusion process, as proved in Theorem 6 is . The other is the message complexity of the backup selection process. As mentioned above, during the backup selection process, each critical node must send a BACKUP message to its selected backup node. In the worst case as shown in Figure 8, there are -2 critical nodes. Therefore, the message complexity of this process is . Hence, the total message complexity of GDCR is , i.e., .
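The count of critical nodes in the worst case can be checked with a short Python sketch. It assumes that the topology of Figure 8 is a chain, and it identifies critical nodes with a brute-force, centralized test (remove a node and check whether the rest stays connected). This test is used only for verification and is not the localized one-hop detection that GDCR performs; the function names are hypothetical.

```python
def chain(n):
    """Adjacency lists of a chain 0 - 1 - ... - (n-1)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def is_connected_without(adjacency, removed):
    """Check whether the network stays connected after deleting one node."""
    nodes = [n for n in adjacency if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        cur = stack.pop()
        for nb in adjacency[cur]:
            if nb != removed and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(nodes)


def critical_nodes(adjacency):
    """Nodes whose failure would split the network (cut vertices)."""
    return [n for n in adjacency if not is_connected_without(adjacency, n)]


crit = critical_nodes(chain(6))
print(crit)  # the interior nodes of the chain
```

In a chain, every node except the two end nodes is critical, so at most one BACKUP message per interior node is needed in the backup selection phase of this model.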

6. Simulation Results
To evaluate the performance of the proposed GDCR, we compare it with the previous algorithms RIM and DCR through a series of simulations. In this section, we first introduce the simulation environment and performance metrics and then give detailed results and analysis.
6.1. Simulation Setup and Performance Metrics
We design and implement the GDCR algorithm in Matlab R2018a and carry out all the simulations on this platform. At the beginning of each run, all the mobile actors are randomly deployed in a square area of 800 m by 800 m. We use the following indicators to evaluate the performance of GDCR:
(i) Total distance moved: the total distance that the nodes move during the recovery. This index evaluates the efficiency of the recovery algorithm in terms of total energy loss.
(ii) Average travel distance: the average distance traveled by the nodes involved in the recovery. It considers the energy loss of each node and thus evaluates the energy load balancing performance of the algorithm.
(iii) Number of relocated nodes: the number of nodes moved during the restoration.
(iv) Number of exchanged messages: this metric estimates the communication cost of the recovery algorithms.
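The three movement-related indicators can be computed from a log of node relocations, as in the following Python sketch. The log format (a list of start and end coordinates, one pair per relocated node) and the function name `movement_metrics` are assumptions for illustration; the simulations in this paper were run in Matlab, and the number of exchanged messages would simply be a counter incremented on every transmission.

```python
import math


def movement_metrics(moves):
    """moves: list of ((x0, y0), (x1, y1)) pairs, one per relocated node."""
    distances = [math.dist(start, end) for start, end in moves]
    total = sum(distances)
    relocated = len(distances)
    average = total / relocated if relocated else 0.0
    return {
        "total_distance": total,      # indicator (i)
        "average_distance": average,  # indicator (ii)
        "relocated_nodes": relocated, # indicator (iii)
    }


# Hypothetical recovery in which two nodes are relocated.
moves = [((0, 0), (30, 40)), ((100, 100), (100, 160))]
metrics = movement_metrics(moves)
print(metrics)
```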
In addition, the following parameters are used as variables in the WSAN simulation experiments:
(i) Number of nodes deployed in the network. Because the deployment area is fixed, this parameter indicates the node density. The more nodes are deployed, the stronger the network connectivity.
(ii) Communication range of the node. This parameter directly affects the network connectivity and thus the movement overhead during the recovery.
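The effect of the two parameters on connectivity can be explored with a small Python sketch that deploys nodes uniformly at random in the 800 m by 800 m area and links any two nodes that lie within the communication range of each other. The unit-disk link model, the fixed random seed, and the helper names are assumptions made for this illustration and are not taken from the Matlab implementation.

```python
import math
import random


def deploy(num_nodes, side=800.0, seed=0):
    """Place nodes uniformly at random in a side x side square (meters)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(num_nodes)]


def build_adjacency(positions, comm_range):
    """Link every pair of nodes whose distance is within the communication range."""
    adjacency = {i: [] for i in range(len(positions))}
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= comm_range:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency


def average_degree(adjacency):
    """Mean number of one-hop neighbors per node."""
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


positions = deploy(40)
for comm_range in (50, 100, 150, 200):
    print(comm_range, average_degree(build_adjacency(positions, comm_range)))
```

For a fixed deployment, enlarging the communication range can only add links, so the average number of neighbors never decreases as the range grows.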
6.2. Performance Evaluation of GDCR
Each experiment randomly generates a network topology for a given number of nodes and communication range. The number of nodes is chosen from the set {20, 40, 60, 80, 100} and the communication range from the set {50, 100, 150, 200} m. For each experiment, RIM, DCR, and GDCR are executed after a randomly chosen node fails. When the number of nodes is the variable, the communication range is fixed at 100 m. When the communication range is the variable, the number of nodes is set to 40. The simulation results are shown in Table 2. We can see that GDCR outperforms RIM and DCR on every proposed metric. The experimental results are detailed and analyzed below.
Total distance moved: Figure 9 shows the total distance moved by all the nodes during the recovery. GDCR is significantly superior to the other algorithms because the repair path it chooses is always the optimal one in the global network. As the two graphs in Figure 9 show, the total distance moved under RIM grows linearly with the communication range and the network size, so RIM is only suitable for small networks. The performance of DCR and GDCR remains basically stable, but the total distance moved under GDCR is significantly smaller than under DCR. This is because DCR often does not select the optimal backup node, which results in additional movement overhead.

Average travel distance: as mentioned earlier, the movement of nodes consumes a lot of energy, which may lead to node failure. Such a failure triggers a new recovery process and further energy consumption, creating a vicious circle. Therefore, the average moving distance of nodes during the restoration process must be considered to evaluate the load balancing ability of the restoration algorithm. As shown in the two graphs in Figure 10, when the network scale is small, RIM performs better than DCR. This is because RIM involves more nodes in the recovery process, so each node moves a shorter distance than in DCR. However, as the network scale and communication range increase, the performance of RIM degrades steadily, while that of DCR generally remains stable. Throughout, GDCR maintains a lower average moving distance.

Number of relocated nodes: as shown in the two graphs in Figure 11, GDCR again performs best, followed by DCR and then RIM. RIM moves all the neighbors of the failed node and then any further nodes that must move to maintain their connections, so it involves many nodes in the recovery process. In contrast, DCR and GDCR both limit the recovery range and avoid cascading movement by selecting noncritical nodes as backups, so they perform better. In addition, the repair path selected by GDCR is the shortest, i.e., the recovery range is the smallest, so fewer nodes are involved.

Number of exchanged messages: communication cost is another important indicator for evaluating connectivity restoration schemes. Figures 12(a) and 12(b) report the total number of messages exchanged during the connectivity restoration process. In RIM, the recovery decision is made by all the one-hop neighbors of the failed node, and messages are exchanged among all of them, so RIM generates the highest message overhead. In contrast, DCR and GDCR let the failed (primary) node make the recovery decision, which restricts the message exchange to the primary node and its backup node, so both perform better than RIM.

In addition, Figures 12(a) and 12(b) show that as the network scale and communication range increase, the message overhead generated by RIM also increases significantly. The performance of DCR and GDCR is basically the same, and the gap in communication overhead between the two is negligible. In the message exchange before failure, however, DCR performs better, because GDCR requires an additional gradient diffusion process. Although GDCR increases the communication overhead before failure, it brings better performance in other aspects.
7. Conclusion and Future Work
In this paper, a gradient-based distributed connectivity recovery algorithm is proposed to solve the problem of lost connectivity caused by a single node failure in WSAN. The proposed GDCR algorithm identifies the critical nodes from one-hop neighbor information and selects the optimal backup for each critical node based on the generated gradient distribution. GDCR is a completely distributed, localized, stable, and efficient connectivity recovery scheme. Compared with the existing algorithms, GDCR achieves the best performance in movement cost and recovery range and also performs well in communication cost and network coverage. Simulation results confirm the effectiveness of GDCR on all the evaluation metrics. Due to the limitations of the simulation platform, this paper does not model or analyze the energy cost and lifetime of nodes, which is left for future research.
Data Availability
The data used in this paper were randomly generated by the simulation software.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 61902194), Natural Science Foundation of Jiangsu Province (Higher Education Institutions) (BK20170900, 19KJB520046, and 20KJA520001), Innovative and Entrepreneurial Talents Projects of Jiangsu Province, Jiangsu Planned Projects for Postdoctoral Research Funds (no. 2019K024), Six Talent Peak Projects in Jiangsu Province (JY02), Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19_0921 and KYCX19_0906), Zhejiang Lab (2021KF0AB05), and NUPT DingShan Scholar Project and NUPTSF (NY219132).