Abstract

Multisensor distributed dynamic programming for collaborative warning and tracking during antimissile combat serves to meet the tracking accuracy requirements of all ballistic targets in the battlefield when the total amount of sensor resources is limited. This paper proposes a game theory-based method of multisensor distributed dynamic programming for collaborative warning and tracking. First, starting from the target tracking algorithm and following the characteristics of antimissile multisensor combat planning, box particle filter (BPF) theory, which supports distributed filtering and handles inaccurate measurements, is introduced. Using the flight phase characteristics of ballistic targets, a box particle filter with a variable structure adaptive interacting multiple model (VSAIMM-BPF) is proposed. The method solves the continuous real-time tracking problem of the ballistic target in all phases and achieves high tracking accuracy while reducing computational complexity. Then, the motion state of each ballistic target in combat is recursively estimated by the filtering algorithm, and the calculated sensor information gain is used as the measure by which each community of interest competes, through the game, for more or better sensor resources to track its corresponding ballistic target. Ultimately, the method achieves distributed dynamic programming.

1. Introduction

Multisensor task planning is a hot topic in information fusion. It was proposed in the 1970s and was first applied in military command and control [1]. Dressler et al. [2] implemented dynamic tasking and optimized control of data collection based on a closed-loop control strategy, which can be regarded as an early application of multisensor task planning. Nash [3] studied sensor allocation in multisensor multitarget tracking, but that work was carried out on a single platform. Manyika [4] and Schmaedeke [5] pointed out that sensor detection can be described by posterior entropy, which is helpful for the optimized control of sensors. Since then, information-driven sensor task planning has developed rapidly. The necessity and advantages of the information-driven method were argued by Hintz and McIntyre [6].

Owing to the sensitivity of the topic, there are few public reports on sensor allocation in antiballistic missile combat. Research on such topics mainly focuses on two aspects: (1) the allocation and optimization of detection and tracking ability, based on information theory, operations research, filtering, and intelligent optimization methods; (2) target recognition and guidance technology, including intelligent recognition, the choice of guidance method, and target indication.

The above analysis indicates that most researchers focus on task planning for a specified sensor platform or a specified stage of the task. This work is helpful for research on multisensor task planning. However, the antiballistic missile battlefield is highly complex, strongly adversarial, and increasingly dynamic. These characteristics pose great challenges to research on multisensor task planning. This motivates us to develop a new method for target tracking, especially for highly threatening ballistic missiles. The main contribution of this paper is a new box particle filter with a variable structure adaptive interacting multiple model. The proposed method improves the tracking accuracy for ballistic missiles. Moreover, the time complexity of tracking is reduced, which helps meet the timeliness requirements of antimissile operations.

1.1. Basic Process Analysis

Distributed dynamic programming for collaborative warning and tracking is divided into warning and coordination [1–12]. The main concerns of both are the stability and optimization of target tracking performance. Both can use the optimal metric of certain specific characteristics, such as detection probability, intercept probability, and tracking accuracy, as the objective function, and achieve the dynamic target sensor adjustment with the distributed task planning method. Therefore, they can be incorporated into a unified framework for processing.

Regarding the combat operational process, the warning and tracking process of antimissile operations is mainly completed by space-based infrared satellites, P-band long-range warning radars, and ground/sea-based X-band multifunction phased-array radars. Early-stage warning is mainly completed by space-based infrared satellites. The secondary warning is completed by space-based infrared satellites and P-band long-range warning radar. P-band long-range warning radars and ground/sea-based X-band multifunction phased-array radars are responsible for detection and tracking and involve the entire ballistic flight. The specific tracking time-space windows are constrained by the radar deployment locations. The warning and tracking processes are integrated and involve coordination and cooperation between multiple sensors. At the same time, for the defenders, the threat of each ballistic target is deadly. If the warning is not made in time or the target is missed, the consequences are extremely serious. The warning targets have certain randomness, and the time, location, quantity, and type of occurrence are difficult to predict; therefore, all the warning tasks are randomly and dynamically generated. The countermeasures taken by the ballistic target, and the performance limitations of the system’s sensor resources, result in certain uncertainty in tracking the target. This requires the improvement of the target detection probability (warning capability) of emergent/lost targets to ensure continuous and stable tracking of the tracked ballistic target. In summary, warning and tracking, the two different target detection processes, are closely related. It is necessary to consider both the target tracking accuracy and target detection probability according to the detection state (warning or tracking) of the target and to combine them into the objective function and adaptively allocate the limited sensor resources using the adaptive optimization algorithm.

1.2. Analysis of Key Factors

Distributed warning dynamic programming and distributed tracking dynamic programming are two closely related stages of a common problem (the target is in the warning stage or tracking stage; these are the two different stages of the target detection process). This is a complex optimal combination problem. In order to achieve timely warning of new/lost targets while ensuring continuous tracking of tracked targets, it is necessary to analyze and solve all targets together; namely, the purpose of multisensor distributed dynamic programming for collaborative warning and tracking in an antimissile combat is to dynamically adjust the sensors under the constraints of limited sensor resources and the required tracking accuracy when multiple ballistic targets appear dynamically. Thus, we can complete proper detection (warning and tracking) behavior to optimize overall warning and tracking performance. Specifically, the following need to be addressed.

1.2.1. Filtering Algorithm Planning Requirements

Tracking filtering is the most basic element of a detection and tracking system [1]. In antimissile combat operations, the values measured by the sensors (radial distance, pitch angle, and azimuth) are nonlinearly related to the target state, and in a complex combat environment the tracking system is often subject to unknown synchronization bias or system delay. The state noise and the measurement noise are also non-Gaussian. Therefore, this is a nonlinear filtering problem. Currently, the commonly used filtering algorithms are the extended Kalman filter (EKF) algorithm [13], the unscented Kalman filter (UKF) algorithm [14, 15], and the related improved algorithms. The improved algorithms include the point estimation algorithms, represented by the combination of the Kalman filter with a linear approximation of the nonlinear system, and the density estimation algorithms, represented by the particle filter (PF) algorithm and the ensemble Kalman filter (EnKF) algorithm. The former can handle multivariable, time-varying systems and nonstationary random signals. However, under non-Gaussian noise, their tracking performance deteriorates significantly and may even diverge. The latter are not limited by linearization error or the Gaussian noise assumption and can be used in nonlinear non-Gaussian systems; however, because the computational load grows as the required performance increases, their real-time performance is poor. Therefore, it is necessary to balance the accuracy, measurement efficiency, and robustness of the algorithm in order to meet the specific requirements of the multisensor task planning system for antimissile combat and to develop an appropriate filtering algorithm.

1.2.2. Appropriate Criteria

On the basis of the filtering algorithm, it is necessary to judge the quality of the current tracking and then decide whether to carry out dynamic programming. The first step is to decide which criteria guide the dynamic programming process:

As shown in equation (1), the dynamic programming process can be stated mathematically as follows: for target i, find, within the search space and under the constraint condition, the optimal sensor set that maximizes the value of the cost function. Here, the search space is the set of all sensors in the system, y represents a set of sensors that satisfies the constraint for target i, the cost function is built on the chosen criteria, and the optimal sensor set is the output of the dynamic programming. For collaborative warning and tracking dynamic programming, this paper mainly considers the following factors:
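The search described by equation (1) can be illustrated with a minimal sketch. It assumes an exhaustive search over small sensor subsets, a feasibility test standing in for the constraint condition, and an additive cost function; the sensor names, gain values, and the function `best_sensor_set` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def best_sensor_set(sensors, is_feasible, cost, max_size):
    """Search all subsets up to max_size and return the feasible one with the largest cost value."""
    best_set, best_value = None, float("-inf")
    for size in range(1, max_size + 1):
        for subset in combinations(sensors, size):
            if not is_feasible(subset):
                continue  # subset violates the constraint for this target
            value = cost(subset)
            if value > best_value:
                best_set, best_value = subset, value
    return best_set, best_value


# Hypothetical per-sensor gains for one target (illustrative numbers only).
gain = {"ir_sat": 0.4, "p_band": 0.7, "x_band_1": 1.2, "x_band_2": 1.1}
# Assumed constraint: only sensors whose coverage contains the target may be used.
visible = {"p_band", "x_band_1", "x_band_2"}

chosen, value = best_sensor_set(
    sensors=sorted(gain),
    is_feasible=lambda s: set(s) <= visible,
    cost=lambda s: sum(gain[j] for j in s),
    max_size=2,
)
print(chosen, round(value, 2))
```

Exhaustive enumeration grows combinatorially with the number of sensors, which is one reason a game-based allocation is attractive when many targets compete for the same sensors.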

(1) Detection Probability. In general, the threat of ballistic missiles is constantly changing over time. This is true both in wartime and during peacetime. The dynamics of wartime are more pronounced than in peacetime. This dynamic nature in the warning and tracking system is mainly reflected by the uncertainty of emergence and disappearance of ballistic targets. The serious threat by the targets requires that the antimissile system must handle any possible ballistic targets, and the consequence of missing targets is often unacceptable. Therefore, during antimissile combat, the antimissile system must maintain a certain level of warning to ensure timely detection of new/lost targets and provide enough warning time for interception operations. The warning capability is mainly reflected by the detection probability of the target. Each sensor’s target detection probability obeys the probability distribution of certain characteristics [16]. The detection capability can be used to calculate the coverage ability of different sensor sets in the regions where the target may appear and then to achieve timely detection of the target.

(2) Tracking Accuracy. Whether for warning or tracking operations, the main purpose is to achieve the detection and tracking of the ballistic target. The tracking accuracy determines the continuity and stability of the detection and tracking process, which is one of the key factors that need to be optimized. For antimissile combat operation, the optimization of tracking accuracy is no longer achieved by simply increasing the number of sensors. It is necessary to consider the current combat situation, such as target priority, accuracy requirements for current tasks, and possible future threats, and properly allocate an appropriate quantity of sensor resources to each target. While meeting the tracking accuracy requirements of all current tasks, as many idle sensor resources as possible are reserved to cope with possible emergencies.

(3) Target Number. In antimissile combat operations, ballistic missiles can attack in multiple batches, from multiple directions, and at multiple levels. For the sensor network, it is necessary to obtain information on multiple ballistic targets in real time, which gives rise to the multiobjective collaborative planning problem. When new targets appear, it is necessary to ensure warning of the new targets and timely allocation of sensor resources for detection. When a target disappears or its track is lost, it is necessary to release the relevant sensor resources in time for use in other tasks and to coordinate other sensor resources to detect the lost target. Therefore, the dynamic change in the number of targets is also one of the key factors to consider.

1.2.3. Mathematical Methods for Effectively Solving Conflict and Cooperation Problems

From the perspective of target tracking, it is always the case that more sensors used to track targets will lead to the capture of greater amounts of information and more benefits to target tracking by enhancing the accuracy and stability of the target tracking. Meanwhile, from the perspective of the whole sensor network, the total resources for the warning and detection are limited. The more the sensor warning and detection resources are taken up by a certain target, the more limited are the sensor warning and detection resources available for other targets. Therefore, in future scenarios of large-scale, multibatch, multilevel, antisaturation attack antimissile operations, the improvement of target tracking accuracy is not the most crucial issue of task planning. How to balance the conflict between resources and objectives, under the premise of ensuring a certain degree of tracking accuracy, and the dynamic planning of limited sensor resources is the research focus of distributed dynamic programming for collaborative warning and tracking.

1.3. Planning Guidelines and Assumptions

The guidelines for multisensor distributed dynamic programming for collaborative warning and tracking mainly include the following:

(1) In terms of the defender, it is necessary to ensure that each ballistic target can be detected, the warning capability (detection probability) must not be lower than the set threshold, and priority is given to tracking targets with a high threat level.

(2) In terms of the ballistic target, the performance of each target’s tracking must at least meet the requirements of task planning, and the time duration to track the target should be as long (stable and continuous) as possible.

(3) In terms of sensor tracking performance, it is preferred to use sensors with high detection accuracy to detect targets.

(4) In terms of the reliability and technical complexity of antimissile combat operations, it is necessary to minimize the number of sensor handovers to ensure continuous target tracking.

The study of multisensor distributed dynamic programming for collaborative warning and tracking is based on the following assumptions:

(1) It is assumed that the deployment location, performance parameters, and corresponding detection and tracking coverage are known for each sensor.

(2) The ballistic target is predictable; that is, the initial information of the target can be predicted, and the target has only limited maneuverability during the flight.

(3) Based on (1) and (2), when a warning for a target occurs, the visibility time window of the sensor for the target (the time and spatial information of the target’s coverage by the sensor) can be approximated.

(4) The handover problem limits the target transition between two sensors. During the handover, the sensor devices must coordinate the transition so that the ballistic target trajectory can be smoothly connected. This involves specific radar systems and sampling intervals. This study assumes that the sensor handover process is smooth; that is, the success rate of handover is 100%.

(5) It is assumed that the information in antimissile combat can be shared in real time; that is, the communication network can support coordinated operation of multiple sensors, and the problem of communication energy consumption is ignored.

2. Game Theory and Distributed Dynamic Programming for Collaborative Warning and Tracking in Antimissile Combat

A game is an interdependent decision-making situation in which more than one party participates. In general, the elements of a game model include participants, information, strategies, payoffs, rationality, goals, sequence of action, results, and equilibrium. Among them, participants, strategies, and payoffs are the most basic elements. Participants are individuals who can make decisions in the game. Participants must have a certain autonomy (otherwise, they are passive participants and can only act as environmental parameters in the game process). They minimize cost (or maximize payoff) by employing strategies in the game. In this paper, the warning/tracking community of interest corresponding to each ballistic target is considered a virtual participant in the game; each seeks as many sensor resources as possible for its ballistic target in the cooperative combat. A strategy is a possible action or plan of a participant, and the participant acts according to its strategy. In this paper, the strategy under the collaborative warning and tracking dynamic programming problem is that each warning/tracking community of interest provides a series of possible sensor sets according to the tracking requirements of the system for the corresponding ballistic target. The payoff is a common unit of measure in game theory that describes the participants’ preferences for different game outcomes. In this paper, the improvement in the tracking accuracy of the warning/tracking community of interest under a given sensor set is used as the payoff. In addition, without loss of generality, this paper assumes that each sensor can observe only one target at a time (a sensor able to observe multiple targets can be considered as multiple virtual sensors, each observing a single target).

2.1. Game Payoff in Distributed Dynamic Programming for Collaborative Warning and Tracking

According to the three elements of game theory, the determination of the game payoff is the key point of the whole problem. For warning and tracking, the game payoff is the improvement in task quality (tracking accuracy and detection probability) that a sensor sequence scheme brings to the warning and tracking task of the ballistic target represented by the consumer. Therefore, it is first necessary to find reasonable evaluation criteria to measure the quality of the task. Obviously, this starts with the target tracking algorithm.

In a nonlinear and non-Gaussian environment, the continuous and stable tracking of ballistic targets is particularly important, and it has long been an active subject of study. In terms of the filtering algorithm, since the motion equation and the radar measurement equation of ballistic targets are usually nonlinear, tracking is a nonlinear filtering problem [17] and thus requires a stable, high-precision nonlinear filtering method. Currently, the filtering algorithms used for ballistic target tracking mainly include EKF, UKF, PF, and related improved filtering algorithms. The literature shows that EKF has good tracking performance but adapts poorly to nonlinearity and diverges easily [18]. UKF has better tracking performance, but it is more complicated in high-dimensional cases. PF can achieve a higher tracking accuracy, but it suffers from a large computational load, poor real-time performance, and particle degradation [19–21]. In summary, a ballistic target is a noncooperative, high-speed, high-threat target with strict real-time requirements, and tracking it therefore requires an algorithm with low computational complexity and high accuracy.

Therefore, this paper uses a generalized particle filter, that is, a box particle filter (BPF), for target tracking [14–17]. Based on the phased characteristics of ballistic target flight, an interacting multimodel box particle filter is designed to solve the problem of full-phase, continuous, and stable tracking of the ballistic target. By using the BPF algorithm to track the ballistic target with less cost and lower algorithm complexity, we strive to meet the real-time requirements of antimissile multisensor task planning while ensuring the desired tracking accuracy.

2.1.1. Ballistic Target Tracking Based on Variable Structure Adaptive Interacting Multimodel Box Particle Filter (VSAIMM-BPF)

The flight of a ballistic target is usually divided into three basic phases based on its flight characteristics: the active phase, free phase, and reentry phase. Because ballistic targets in different phases are subject to significantly different forces, their dynamic characteristics differ. Therefore, most researchers have studied the tracking problem for a specific phase of the ballistic target and have obtained remarkable results. However, there are few studies on tracking ballistic targets throughout all phases. The specific difficulties are as follows: (1) because of the strong nonlinearity, it is difficult to describe the flight process of the entire ballistic trajectory with a single mathematical model. (2) It is difficult to accurately determine the specific transition time between flight phases, and the error increases significantly when a common tracking method is used across the transition between two different motion phases, so that the target is easily lost. Therefore, at present, the interacting multimodel method is mainly used to track ballistic targets in different phases [22]. Yu et al. [23] proposed that a continuous tracking algorithm can solve such problems fundamentally, and the continuous stable tracking of ballistic targets is a key problem that must be solved in task planning.

Given the above situation, this paper proposes a new IMM-BPF tracking method by integrating interacting multimodel and box particle filtering algorithms.

Variable Structure Adaptive Interacting Multiple Model Tracking Method (VSAIMM)

(1) Method Description. The basic steps of the interacting multiple model algorithm are as follows: input interaction, filtering, model probability update, and output fusion. Based on the filtering algorithm, interacting multiple model algorithms can be divided into the interacting multiple model Kalman filter (IMM-KF), interacting multiple model unscented Kalman filter (IMM-UKF), and interacting multiple model particle filter (IMM-PF).

To use the interacting multimodel algorithm, determining the target’s motion model set is the key. The evaluation performance of an interacting multimodel method depends to a large extent on the set of models used. In theory, the established motion model needs to cover all the motion patterns of the target. This causes contradictions: to improve the tracking accuracy, many models are needed to fit the motion of the target, but too many models will drastically increase the computing requirements, which leads to decreased estimation performance [24]. Therefore, this section uses a variable multimodel filtering method based on the irreversible phase characteristics of ballistic target flight. The method is based on the three-phase model set of ballistic target flight; the set includes an active phase model, free phase model, and reentry phase model. A time-varying model set is adopted, and the box particle filter algorithm is adopted in each filter.

(2) Transition Probability between Model Sets. According to the definition of the flight phases of the ballistic target, the process is irreversible and measurable; that is, a ballistic target in the free phase cannot return to the active phase, and a ballistic target in the active phase cannot transfer directly to the reentry phase, as shown in Figure 1. Figure 1 marks five probabilities: the probability of remaining in the active phase, the probability of transitioning from the active phase to the free phase, the probability of remaining in the free phase, the probability of moving from the free phase to the reentry phase, and the probability of remaining in the reentry phase. Therefore, the Markov transition probability matrix based on the flight phase characteristics is obtained as follows:
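The structure of such a matrix follows directly from the irreversibility described above and can be sketched as below. The numerical values 0.95 and 0.98 and the function names are illustrative assumptions, not values from the paper. Because the reentry row has only one admissible entry, the probability of remaining in the reentry phase must equal 1.

```python
def phase_transition_matrix(p_stay_active, p_stay_free):
    """Markov matrix over (active, free, reentry); backward and phase-skipping moves are zero."""
    return [
        [p_stay_active, 1.0 - p_stay_active, 0.0],  # active -> active / free
        [0.0, p_stay_free, 1.0 - p_stay_free],      # free -> free / reentry
        [0.0, 0.0, 1.0],                            # reentry is absorbing
    ]


def propagate(model_probs, matrix):
    """Predict the phase probabilities one step ahead: mu_j = sum_i mu_i * P[i][j]."""
    n = len(matrix)
    return [sum(model_probs[i] * matrix[i][j] for i in range(n)) for j in range(n)]


P = phase_transition_matrix(0.95, 0.98)  # assumed values, for illustration only
mu = propagate([1.0, 0.0, 0.0], P)       # target known to be in the active phase
print([round(m, 3) for m in mu])
```

Starting from the active phase, one propagation step cannot place any probability on the reentry phase, which is exactly the constraint shown in Figure 1.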

(3) Variable Structure Model Set. The key point of the full-phase continuous tracking algorithm is the stable and continuous tracking performance of the ballistic target at the transition of two different flight phases. At the transition, it is difficult to determine the current flight state of the ballistic target. Different from the traditional method of judging the flight phase based on motion characteristics such as target position and speed, this section uses the current model probability to determine the current flight phase:

In equation (3), the model probability is the probability that the ith model correctly describes the target motion, and α is a threshold parameter that is generally set to 0.8. In this way, the corresponding model set at each flight phase can be obtained, as shown in Table 1.
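A minimal sketch of a threshold test of this kind is given below. Since equation (3) and Table 1 are not reproduced here, the exact rule is an assumption: a phase is declared when its model probability reaches α, and otherwise the target is treated as being in a transition between phases. The function name `judge_phase` and the use of "greater than or equal" are hypothetical choices.

```python
def judge_phase(model_probs, alpha=0.8):
    """Return the phase whose model probability reaches alpha, or None during a transition.

    model_probs maps a phase name to the current probability of its model.
    """
    phase = max(model_probs, key=model_probs.get)  # most probable model
    return phase if model_probs[phase] >= alpha else None


print(judge_phase({"active": 0.9, "free": 0.1}))    # clearly in one phase
print(judge_phase({"active": 0.55, "free": 0.45}))  # no model dominates
```

When no model dominates, a tracker would typically keep the models of both adjacent phases active until one probability rises above the threshold again.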

2.1.2. Interacting Multimodel Box Particle Filter Algorithm

After determining the flight phase and the corresponding set of models, the models interact through an interacting multimodel algorithm structure. For each model, a group of box particles is generated. This group of boxes is resampled after interacting input and box particle filtering and is finally subject to interacting output. This cyclic recursive propagation updates these box particles to complete the estimation of target states.

As shown in Figure 2, the steps of the interacting multimodel box particle filter algorithm are as follows:

(1) Initialization (Newborn Box Particles). When there is a large error between the target prediction and the measurement, contracted box particles after constraint propagation will degrade rapidly. Therefore, it is necessary to initialize before the prediction. The newborn box particles are replenished according to the current measurement. The specific steps are described below.

According to the target measurement obtained at the current time, the model generates a group of newborn box particles. At the same time, the box particles of the model that were retained after resampling at the previous time form the persistent box particles. Then, the box particle state set of the ith model consists of the newborn box particles and the persistent box particles:

The total number of box particles in the ith model is therefore the sum of the number of newborn box particles and the number of persistent box particles.

(2) Input Interaction. We first calculate the transition mixture probability of the model:

Then, we calculate the mixture estimate, which is the model’s interacting box particle input:
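The input interaction step can be sketched with the standard IMM mixing formulas, which are assumed here to carry over to the box particle setting. The mixed box is formed by a weighted interval sum, which is a simplification for one-dimensional boxes; all names and numbers are illustrative and not taken from the paper.

```python
def mixing_probabilities(mu, P):
    """Standard IMM input interaction.

    mu[i]   : probability of model i at the previous step
    P[i][j] : transition probability from model i to model j
    Returns (c, mix), where c[j] is the predicted probability of model j and
    mix[i][j] is the probability that model i was active given model j is active now.
    """
    n = len(mu)
    c = [sum(P[i][j] * mu[i] for i in range(n)) for j in range(n)]
    mix = [[P[i][j] * mu[i] / c[j] if c[j] > 0 else 0.0 for j in range(n)]
           for i in range(n)]
    return c, mix


def mix_boxes(boxes, weights):
    """Weighted interval sum of 1-D boxes (lo, hi) with nonnegative weights."""
    lo = sum(w * b[0] for w, b in zip(weights, boxes))
    hi = sum(w * b[1] for w, b in zip(weights, boxes))
    return lo, hi


mu_prev = [0.6, 0.4]
P = [[0.9, 0.1], [0.0, 1.0]]  # assumed two-model transition matrix
c, mix = mixing_probabilities(mu_prev, P)
# Mixed input box for model 1, built from one box per model.
mixed = mix_boxes([(0.0, 2.0), (10.0, 12.0)], [mix[0][1], mix[1][1]])
print([round(v, 3) for v in c], tuple(round(v, 3) for v in mixed))
```

Each column of `mix` sums to one, so the mixed box keeps the width of the inputs when all input boxes have the same width.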

(3) Box Particle Filtering for Each Model. Predicted status:

Predicted measurement:

Interval measurement innovation:

Box particle weight calculation:

Contracted box particles:

For any box particle whose interval measurement innovation is nonempty (that is, a box particle that intersects the observation area), a new box particle is obtained by applying the constraint propagation (CP) algorithm. The box particles are contracted according to the constraint algorithm described previously.

Resampling: resampling yields a new box particle set.

Output:
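The per-model filtering cycle can be illustrated with a one-dimensional sketch. It assumes a constant-velocity prediction with an interval velocity, a direct position measurement (so that contraction reduces to interval intersection), and the commonly used box particle likelihood given by the ratio of the innovation width to the predicted width. Resampling by box subdivision is omitted. All names and numbers are hypothetical.

```python
def intersect(a, b):
    """Intersection of two closed intervals, or None when they do not overlap."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def width(a):
    return a[1] - a[0]


def bpf_update(boxes, weights, velocity, dt, z_box):
    """One predict/update cycle for 1-D position boxes with a position measurement box."""
    new_boxes, new_weights = [], []
    for box, w in zip(boxes, weights):
        # Predicted state: interval arithmetic for x + v * dt (dt > 0).
        pred = (box[0] + velocity[0] * dt, box[1] + velocity[1] * dt)
        innovation = intersect(pred, z_box)  # interval measurement innovation
        if innovation is None:
            likelihood, contracted = 0.0, pred
        elif width(pred) == 0:
            likelihood, contracted = 1.0, innovation
        else:
            likelihood = width(innovation) / width(pred)
            contracted = innovation  # contraction = intersection for this measurement
        new_boxes.append(contracted)
        new_weights.append(w * likelihood)
    total = sum(new_weights)
    if total > 0:
        new_weights = [w / total for w in new_weights]
    return new_boxes, new_weights


def estimate(boxes, weights):
    """Output: weighted mean of the box centers."""
    return sum(w * 0.5 * (b[0] + b[1]) for b, w in zip(boxes, weights))


boxes_out, weights_out = bpf_update(
    boxes=[(0, 2), (2, 4), (8, 10)],
    weights=[1 / 3, 1 / 3, 1 / 3],
    velocity=(1, 1),
    dt=1,
    z_box=(2.5, 4.0),
)
print(boxes_out, [round(w, 3) for w in weights_out], estimate(boxes_out, weights_out))
```

The third box does not intersect the measurement and receives zero weight, which is the situation in which newborn box particles are needed to avoid degeneration.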

(4) Model Probability Update. Model likelihood calculation:

Different from the standard IMM Bayesian model likelihood calculation, as shown in equation (13), the likelihood function of model j is the ratio between the sum of the box particle weights after the update and the number of box particles before sampling. Obviously, it can be seen from equation (13) that when a model conforms to the motion state of the current target, the corresponding box particle weight of the model will be larger than the box particle weight of other models; otherwise, the model cannot correctly describe the current target motion state, and the prediction bias will increase, which will cause the likelihood function to decrease. Thus, the updated model probability can be correctly calculated by equation (14) to achieve the model selection.
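The likelihood described above and the subsequent probability update can be sketched as follows. The likelihood follows the verbal description (sum of updated weights divided by the number of box particles), and the update uses the standard IMM normalization, which is assumed to match equation (14). The weights and function names are illustrative.

```python
def model_likelihood(updated_weights):
    """Sum of the unnormalized updated weights divided by the number of box particles."""
    return sum(updated_weights) / len(updated_weights)


def update_model_probabilities(likelihoods, predicted_probs):
    """Standard IMM update: mu_j is proportional to likelihood_j * predicted_prob_j."""
    joint = [lik * c for lik, c in zip(likelihoods, predicted_probs)]
    total = sum(joint)
    if total == 0:
        return list(predicted_probs)  # no model explains the measurement; keep prediction
    return [v / total for v in joint]


# Model A matches the current motion, model B does not (illustrative weights).
lik_a = model_likelihood([0.2, 0.3, 0.1, 0.0])
lik_b = model_likelihood([0.02, 0.0, 0.04, 0.0])
mu = update_model_probabilities([lik_a, lik_b], [0.5, 0.5])
print([round(m, 4) for m in mu])
```

A model whose box particles overlap the measurement keeps large weights and therefore gains probability, while a mismatched model loses it.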

Model probability update:

(5) Interacting Multimodel Box Particle Filtering Output:

2.2. Calculation of Game Payoff

2.2.1. Rényi Information Gain

Liu [1] pointed out that the Rényi information gain places no Gaussian restriction on the distributions of the prior probability density function and the posterior probability density function. The Rényi information gain can also be used to emphasize certain local information. Therefore, it is more flexible and effective than the mutual information-based method. The Rényi information gain can thus be used to characterize the tracking performance of a sensor and serve as the game payoff. The Rényi information gain is defined as follows:

The definition involves the posterior probability density function and the likelihood function of sensor j with respect to the target, as well as the marginal distribution of the observation made by sensor j.

The prior probability density function in box particle filtering can be calculated according to

Here, Np is the number of box particles, each box particle carries a normalized weight, and each box is the support set of a uniform probability density function.

For the box measurement, we have

Substituting equations (18) and (19) into equation (17) yields

where the contraction term indicates that the box particles are contracted by the measurement, thus eliminating the excess box particles.

Similarly, according to the law of total probability and equation (18), the marginal distribution of the observation can be discretized as follows:

Thus, according to the law of total probability and equation (21), equation (20) can be discretized as follows:

It can be seen from equation (22) that, through box particle filtering, the Rényi information gain can be approximated by a weighted sum over a finite set of samples (box particles), which avoids complicated integral operations. Liu [1] analyzed the adjustment parameter α and pointed out that when α is set to 0.5, the two probability densities are discriminated better and the tracking performance improves.
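A weighted-sum approximation of this kind can be sketched as follows. The sketch uses the standard particle form of the Rényi divergence between posterior and prior, in which the gain equals ln(Σ w_i g_i^α / (Σ w_i g_i)^α) / (α − 1), with w_i the normalized weights and g_i the likelihood of each box particle. Whether this matches equation (22) term by term is an assumption; the function name is hypothetical.

```python
import math


def renyi_gain(weights, likelihoods, alpha=0.5):
    """Particle approximation of the Renyi divergence between posterior and prior.

    weights     : normalized prior weights of the box particles
    likelihoods : likelihood of the measurement for each box particle
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    evidence = sum(w * g for w, g in zip(weights, likelihoods))  # approximates p(z)
    if evidence <= 0.0:
        raise ValueError("measurement is inconsistent with every box particle")
    numerator = sum(w * g ** alpha for w, g in zip(weights, likelihoods))
    return math.log(numerator / evidence ** alpha) / (alpha - 1.0)


# A measurement that rules out one of two equally weighted boxes.
informative = renyi_gain([0.5, 0.5], [1.0, 0.0])
# A measurement that is equally likely under both boxes.
uninformative = renyi_gain([0.5, 0.5], [0.4, 0.4])
print(informative, uninformative)
```

In this sketch a measurement that halves the set of plausible boxes yields a gain of ln 2, while a measurement that does not discriminate between the boxes yields zero gain.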

2.2.2. Game Payoff

Both sides of the game always want to maximize their own payoff in the game; that is, they expect to select the sensor set that brings the most Rényi information gains to track the target. Hence, the game payoff is defined as below.

Define the gain of sensor set S at time k for target i as

where Ej[Iα] represents the expected value of the Rényi information gain at the current time and is obtained according to

where Nj is the number of measurements made by sensor j.

Obviously, by increasing the number of sensors in the sensor set S, the payoff can be increased. Moreover, by replacing a sensor with one that has better tracking performance, both the expected information gain and the payoff are increased.
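The behavior described above can be reproduced with a small sketch. It assumes that the payoff of a sensor set is the sum of the expected gains of its sensors and that each expected gain is the average over that sensor's measurements; both are assumptions consistent with the text rather than the paper's exact equations. Sensor names and gain values are hypothetical.

```python
def expected_gain(gains_per_measurement):
    """Average Renyi information gain over the measurements of one sensor."""
    return sum(gains_per_measurement) / len(gains_per_measurement)


def payoff(sensor_set, gains):
    """Assumed payoff of a sensor set: sum of the expected gains of its sensors."""
    return sum(expected_gain(gains[j]) for j in sensor_set)


# Hypothetical per-measurement gains for three sensors observing one target.
gains = {"A": [0.2, 0.4], "B": [0.6, 0.8, 1.0], "C": [0.1, 0.1]}

print(payoff({"A"}, gains), payoff({"A", "B"}, gains), payoff({"B"}, gains))
```

Because the Rényi information gain is nonnegative, adding a sensor never decreases this payoff, and swapping sensor C for the better sensor B increases it.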

3. Game Theory-Based Collaborative Warning and Tracking Dynamic Programming Method

3.1. Execution Conditions

According to the target detection status, that is, new/lost target and tracked target, dynamic programming is performed under the following two execution conditions.

Execution condition 1: the target is a new target or a lost target.

If there is a new target or if the target has been lost, the dynamic programming should be performed immediately, and the sensors in the current monitored area should be selected for detection and interception. Under this condition, the real-time performance requirements for the dynamic programming are high.

Execution condition 2: the tracking performance requirement for the target cannot be met.

During the target tracking, the sensor may not be able to stably track the target due to interference, target penetration, or sensor performance limitations (line of sight, azimuth, and pitch angle). Once the target tracking accuracy is lower than the set threshold, dynamic programming should be immediately performed, and more suitable sensor nodes should be selected to increase tracking accuracy.
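The two execution conditions amount to a simple trigger test, sketched below. The scalar accuracy score (higher is better), the threshold value, and the type and function names are hypothetical and serve only to illustrate the logic.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    NEW = "new"
    LOST = "lost"
    TRACKED = "tracked"


@dataclass
class TargetState:
    status: Status
    accuracy: float  # hypothetical scalar tracking accuracy score, higher is better


def needs_replanning(target, accuracy_threshold):
    """Return (trigger, reason) according to the two execution conditions."""
    if target.status in (Status.NEW, Status.LOST):
        return True, "condition 1: new or lost target"
    if target.accuracy < accuracy_threshold:
        return True, "condition 2: tracking accuracy below threshold"
    return False, None


print(needs_replanning(TargetState(Status.LOST, 0.0), 0.7))
print(needs_replanning(TargetState(Status.TRACKED, 0.5), 0.7))
print(needs_replanning(TargetState(Status.TRACKED, 0.9), 0.7))
```

Condition 1 is checked first because a new or lost target has no meaningful accuracy value to compare against the threshold.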

3.2. Game Theory-Based Collaborative Warning and Tracking Dynamic Programming

According to the analysis of the key factors presented in Section 1.2, a new/lost ballistic target is often more threatening than a currently tracked ballistic target (an unknown, not yet understood target is often more threatening than a known one), and thus sensors need to be assigned promptly for detection and interception. Namely, when a new target appears or a target is lost, the sensors in the vicinity of the area are adjusted first, and the corresponding sensors are assigned to the target to achieve timely detection. Then, the sensors that are tracking targets are adjusted so that the tracked targets continue to be tracked.

3.2.1. Determination of Warning/Tracking Community of Interest Based on New/Lost Target Detection Probability Model

(1) Detection Probability Model [1]. It is assumed that the normal velocity of the new/lost target at the boundary of the surveillance area is uniformly distributed, and m uniformly distributed particles are used to characterize the locations at which the target may appear within the surveillance area. Thus, the distance D between the position where the new target appears and a particle follows the distribution given in the following equation:

Here, is the sampling time of the sensor, and is the velocity of a particle. The detection probability of the sensor j at the location i represented by the particle can be calculated according to

Here, is the false alarm probability of the sensor, is the distance between the sensor j and the particle i, and is the detection signal-to-noise ratio of the sensor at . Then, at time K, the target detection probability, , of the community of interest for the target i can be calculated according to the following equation:

where is a binary function indicating whether the sensor in the monitoring area detects the target, and there are N in total. indicates that the sensor detects the target; otherwise, it does not detect the target. Obviously, the sensor set inside must be a certain combination scheme for . is the sensor detection constraint. indicates that the target is within the observation coverage of sensor j; otherwise, the target is not within the observation coverage of sensor j.

(2) Determination of Community of Interest for Warning/Tracking. According to equation (27), it can be determined whether the detection probability of a certain sensor combination S for target i reaches an agreed threshold. If it does, the community of interest for target i is formed by the sensors in S. Otherwise, the community of interest for target i is formed by the sensor combination with the highest detection probability.
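The selection rule of this subsection can be sketched as follows. Several ingredients are assumptions rather than the paper's equations: the per-sensor model is the Swerling I single-pulse relation P_d = P_fa^(1/(1+SNR)), the sensors are treated as independent so that the joint probability is one minus the product of the miss probabilities, and combinations are searched from the smallest size upward. All function names are hypothetical.

```python
from itertools import combinations

def sensor_detection_probability(p_fa, snr):
    """Assumed Swerling I single-pulse model: P_d = P_fa ** (1 / (1 + SNR))."""
    if not 0.0 < p_fa < 1.0:
        raise ValueError("p_fa must lie strictly between 0 and 1")
    if snr < 0.0:
        raise ValueError("snr must be non-negative (linear scale)")
    return p_fa ** (1.0 / (1.0 + snr))

def joint_detection(sensors, p_detect):
    """Probability that at least one sensor detects, assuming independence."""
    miss = 1.0
    for s in sensors:
        miss *= 1.0 - p_detect[s]
    return 1.0 - miss

def select_coi(available, p_detect, threshold):
    """Return (sensor set, detection probability) for the community of interest.

    The first combination that reaches the threshold is returned; if none
    does, the combination with the highest detection probability is used.
    """
    best, best_p = (), 0.0
    for size in range(1, len(available) + 1):
        for combo in combinations(sorted(available), size):
            p = joint_detection(combo, p_detect)
            if p >= threshold:
                return set(combo), p
            if p > best_p:
                best, best_p = combo, p
    return set(best), best_p
```

Searching small combinations first keeps the warning community as small as possible, which leaves more sensors for the targets already under track. The exhaustive enumeration is exponential in the number of available sensors and is only practical for small regional sensor sets.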

3.2.2. Dynamic Programming of the Target under Tracking

The main idea of the dynamic programming method based on game theory is to play the game between different targets, with the community of interest (COI) of each target as the player. For the tracking COI of target i, when the COI is unable to meet the tracking accuracy requirements of the system for target i, it negotiates with the COIs of other targets to adjust the networked sensor combination, forming a new COI and thus achieving stable tracking of the target.

The game between two targets is taken as an example to discuss the dynamic programming method based on game theory. Assume that, before the dynamic programming, the warning/tracking communities of interest of target i and target j are . After the dynamic programming is performed at the moment K, the warning/tracking communities of interest of target i and target j are adjusted to . Then, the average gains of both parties in the game are calculated according to the following equations:

Here, is the time consumed by the game. According to the definition of game payoff, before the game, the Rényi information gain of a single sensor to the target is stable, and therefore, it is reasonable and feasible to use the average payoff over time to play the game.

If it is necessary to improve the tracking accuracy of target j at moment K, will request negotiation with . If wants to meet the demand of , it must sacrifice its own gain (the tracking accuracy of target i).

If the negotiation is successful, we can get the following equation:

and .

If the negotiation is unsuccessful, we have

and .

Clearly, for the negotiation initiator , the sooner the outcome is reached, the better the outcome will be. The opposite is true for the recipient . This forms a game. Three important definitions of game theory are now introduced [1]:

The game between the two participants in the case of successful negotiation proceeds as follows. From the gaming perspective of the negotiator (), equation (30) represents at the moment k all the game scenarios that are better than the failure of the negotiation. From the gaming perspective of the negotiation’s recipient , equation (31) shows the game plan in that results in the maximized gain/minimized gains if the negotiation is successful.

In the case of negotiation failure, from the gaming perspective of the negotiator (), the sensors involved in could be taken over to improve the tracking accuracy of target j. From the gaming perspective of the recipient (), the greatest concession by is that as long as it satisfies its own expected tracking accuracy for target i, it can provide as many sensor resources as possible. Equation (32) shows that, for at the moment k, there may be a game scheme where the game payoff is greater than that in the optimal scheme of at the moment k + 1. may be an empty set (representing the absence of such a scheme).

To summarize this, the game process at moment K between and is shown in Algorithm 1.

Step 1: according to the current sensor-target status, provides with and requests the relevant sensors in a coordinated combat operation. Under the condition of maximizing the game’s payoff with regard to the target i, chooses from . At this point, has the least payoff in the game.
Step 2: evaluates provided by . If is not satisfied (the requirement for tracking accuracy is not met), go to Step 3; otherwise, go to Step 5.
Step 3: if is an empty set, go to Step 4; otherwise, provides with a plan to maximize the game payoff for , and we proceed to Step 5.
Step 4: accepts the plan provided by .
Step 5: The game ends.
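The five steps can be sketched for a single negotiation round as follows. This is a simplified reading, not the authors' implementation: payoffs are assumed additive over sensors, sensors only move from the recipient to the initiator, and the requirements `need_i` and `need_j` are hypothetical payoff thresholds standing in for the tracking accuracy requirements.

```python
from itertools import combinations

def payoff(sensors, gain):
    """Assumed additive payoff: sum of per-sensor expected gains."""
    return sum(gain[s] for s in sensors)

def negotiate(coi_i, coi_j, gain_i, gain_j, need_i, need_j):
    """One round between the recipient (coi_i) and the initiator (coi_j).

    Returns the adjusted pair (new_coi_i, new_coi_j).
    """
    coi_i, coi_j = set(coi_i), set(coi_j)
    # Every non-empty group of sensors the recipient could hand over.
    transfers = [frozenset(c) for r in range(1, len(coi_i) + 1)
                 for c in combinations(sorted(coi_i), r)]
    if not transfers:
        return coi_i, coi_j
    # Step 1: the recipient offers the transfer that keeps its own payoff highest.
    offer = max(transfers, key=lambda t: payoff(coi_i - t, gain_i))
    # Step 2: the initiator checks the offer against its requirement.
    if payoff(coi_j | offer, gain_j) >= need_j:
        return coi_i - offer, coi_j | offer
    # Step 3: counter-plans that satisfy both parties, if any exist.
    feasible = [t for t in transfers
                if payoff(coi_i - t, gain_i) >= need_i
                and payoff(coi_j | t, gain_j) >= need_j]
    if feasible:
        best = max(feasible, key=lambda t: payoff(coi_j | t, gain_j))
        return coi_i - best, coi_j | best
    # Step 4: no counter-plan exists, so the initiator accepts the offer.
    return coi_i - offer, coi_j | offer
```

For example, with `coi_i = {1, 2, 3}`, `coi_j = {4}`, gains `{1: 0.5, 2: 0.3, 3: 0.1, 4: 0.0}` for target i and `{1: 0.4, 2: 0.2, 3: 0.05, 4: 0.3}` for target j, and requirements of 0.45 for both targets, the recipient first offers sensor 3, the initiator rejects it, and the counter-plan transferring sensors 2 and 3 is adopted.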

3.3. Algorithm Implementation Steps

As shown in Figure 3, the game-based collaborative warning and tracking dynamic programming method is performed as follows.

Step 1. Interacting multimodel box particle filtering.
Perform box particle filtering by taking the sensor observation value and multimodel interaction as input values and obtain the state estimation value of the relevant sensor regarding the target at time K: , .

Step 2. Execution condition decision.
Decide the execution condition. If it is execution condition 1, go to Step 3. If it is execution condition 2, go to Step 4.

Step 3. New/lost target sensor assignment.
Generate m detection particles in the region where the new/lost target may occur. Acquire the set of sensors available in the region and calculate the target detection probability of the different sensor combinations Sk () in the set. As soon as a detection probability exceeds the set detection threshold Td, output the corresponding Sk to form the warning community of interest for the new target, and go to Step 4. If no Sk meets the threshold condition, output the Sk with the largest detection probability to form the warning community of interest for the new target and proceed to Step 4.

Step 4. Dynamic adjustment of sensors for tracked targets.
(1) Calculate the difference between the actual variance level for the community of interest for each target and the system’s expected variance level at the moment K and sort the results in descending order (in execution condition 1, it is necessary to update the sensor set in the corresponding tracking community of interest according to the sensor set Sk of the warning community of interest and then calculate , where , and I is the total number of targets). is calculated as follows:
(2) Select in and play the game with in , where the negotiation proceeds according to Algorithm 1 to obtain a new community of interest .
(3) Update and sort corresponding to the target, and iteratively loop Step 2 until or the value of approaches 0.
(4) At the end of the game, output the COI set for the current target: .

Step 5. Perform detection and tracking of the target according to the adjusted allocation scheme.
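The ranking in sub-step (1) of Step 4 can be sketched as below. The pairing rule in `pick_players`, which matches the target with the largest variance shortfall against the one with the most margin, is an assumption made for illustration, since the exact selection rule is not recoverable from the text. Both function names are hypothetical.

```python
def rank_by_variance_gap(actual_var, expected_var):
    """Sort targets by (actual - expected) variance, largest shortfall first."""
    gaps = {t: actual_var[t] - expected_var[t] for t in actual_var}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)

def pick_players(ranking):
    """Assumed pairing: neediest target negotiates with the one holding most margin."""
    if len(ranking) < 2:
        return None
    return ranking[0][0], ranking[-1][0]

# Hypothetical variance levels for three targets.
ranking = rank_by_variance_gap({1: 4.0, 2: 1.0, 3: 2.5}, {1: 2.0, 2: 2.0, 3: 2.0})
players = pick_players(ranking)
```

A positive gap means the community of interest for that target is tracking worse than required, so it becomes the initiator of the next negotiation round.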

4. Simulation Analysis of Distributed Dynamic Programming Method for Collaborative Warning and Tracking Based on Game Theory

4.1. Experiment Scenario Setting

To verify the rationality and effectiveness of the proposed method and to achieve distributed dynamic programming for collaborative warning and tracking, the simulation scenario is set as follows. Twelve sensors are deployed. Sometime after launch, three batches of ballistic targets enter the radar surveillance area at the simulation times of 10 s, 20 s, and 30 s. The relative positions of the sensors and the ballistic targets are shown in Figure 4. The corresponding sensor position coordinates and the coordinates of launch points and landing points of the ballistic targets are shown in Tables 2–4. When the VSAIMM-BPF algorithm is executed, the noise interval takes the confidence interval of 99%, the interval length of the measurement , ; the number of persistent box particles is 50, the number of newborn box particles is 10, and the measurement sampling interval T = 0.5 s.
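For reproducibility, the scenario parameters stated above can be gathered in one configuration object. The sketch below records only the values given in the text; the class and field names are hypothetical, and the measurement interval lengths are omitted because they are not recoverable here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScenarioConfig:
    num_sensors: int = 12
    target_entry_times_s: tuple = (10.0, 20.0, 30.0)
    noise_confidence_level: float = 0.99
    persistent_box_particles: int = 50
    newborn_box_particles: int = 10
    sampling_interval_s: float = 0.5

    def entry_steps(self):
        """Filter step index at which each target batch enters the surveillance area."""
        return [round(t / self.sampling_interval_s)
                for t in self.target_entry_times_s]

config = ScenarioConfig()
```

With T = 0.5 s, the three batches enter at filter steps 20, 40, and 60.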

4.2. Analysis of Simulation Results

Assume that, at the initial time of tracking (10 s), the sensor set is in the community of interest for target 1 in the network. The simulations compare the method described in this paper with a random assignment method (subjectively selecting the available radars to track the target). Special attention is paid to the target tracking performance of the sensors at the moment when a new target appears and at the transition time of each target flight phase. The obtained simulation results are shown in Figure 5.

Figure 5 shows the adjusted results of the game-based distributed dynamic programming for collaborative warning and tracking at different simulation moments (). In Figure 5, the sensor nodes within the black solid line loop denote COI1 at the current time planned for system target 1, the sensor nodes within the blue solid line loop denote COI2 at the current time planned for system target 2, and the sensor nodes within the red solid line loop denote COI3 at the current time planned for system target 3. Tables 5–7 show the changes in the sensor set responsible for tracking each target at the different simulation moments under the same simulation conditions, using either the method described in this paper or the random assignment method.

Accordingly, Figures 6 and 7 show the tracking trajectory of each ballistic target in the simulation using the two methods. In the time interval [t0, t2], when the new target appears, the motion state estimate of target 1 under the random assignment method deviates from the actual motion state on multiple occasions. With the method described in this paper, the motion state estimate of target 1 also deviates from the actual motion state, but better tracking accuracy is obtained after adjustment. Both methods track target 2 more stably, and their motion state estimates are more accurate. For target 3, the random assignment method produces a large estimation error after the target has flown for a certain period of time. Over the entire tracking process of each target, the method described in this paper performs significantly better than the random assignment method.

According to equation (33), the average position errors for the three targets under the two methods are calculated, as shown in Figure 8. In the early and late stages of tracking the three targets, the overall tracking errors of the sensor network are greater than those in the middle stage, where the tracking error is small and the tracking effectiveness is stable. The tracking accuracy of the method presented in this paper is generally better than that of the random assignment method, especially in the early and late stages of target tracking. This is explained as follows. In the early stage of target tracking, the emergence of new targets takes up a certain amount of sensor resources, and with the random assignment method the tracking error increases sharply because of the inappropriate sensor distribution scheme. Moreover, as each target moves from the active phase to the free phase, errors are unavoidable despite the adaptive adjustment of the VSAIMM-BPF algorithm. The random assignment method cannot quickly and efficiently select suitable sensors for target tracking, which makes it difficult to reduce the local error for each target. At the same time, these local errors affect the next random assignment decision, which leads to a larger and more variable overall tracking error. In contrast, when new targets appear or the ballistic targets transition between flight phases, the game theory-based method uses the real-time, accurate Rényi information gain obtained by the VSAIMM-BPF algorithm to balance the tracking capability and demand of the sensor nodes across the entire sensor network. The method also adaptively adjusts the sensor set to the one currently suitable for tracking each target, thereby obtaining better overall tracking effectiveness.
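A position error metric of the kind plotted in Figure 8 can be sketched as the mean Euclidean distance between estimated and true positions at one time step. This is an assumed form, and equation (33) may differ, for example by taking a root mean square over Monte Carlo runs. The function name `mean_position_error` is hypothetical.

```python
import math

def mean_position_error(estimates, truths):
    """Mean Euclidean distance between paired estimated and true positions."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("need equally sized, non-empty position lists")
    return sum(math.dist(e, t) for e, t in zip(estimates, truths)) / len(estimates)

# Hypothetical estimates for two targets at one time step (coordinates in km).
error = mean_position_error([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
                            [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
```

Evaluating this quantity at every sampling instant gives an error curve over time that can be compared between the two assignment methods.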

Table 8 shows the statistical results of the two methods from 100 Monte Carlo simulation experiments. The detection probability of the random assignment method is slightly higher than that of the proposed method. Combining Table 7 with Figures 5 and 8 and comparing the results of the two methods at the moment when the new target appears, it can be seen that this is because the random assignment method allocates all the sensor resources in the network to the detection of the new target when it appears. As a result, the tracking resources reserved for the tracked targets are reduced, and the overall tracking accuracy of the system decreases. With a detection probability still greater than 98%, our method reasonably allocates the remaining sensor resources to the targets, thus improving the overall tracking performance of the system. In terms of tracking accuracy, the method described in this paper is better than the random assignment method. It balances the tracking demand of the system in the monitored area and dynamically adjusts the sensor resources so that the system maintains continuous and stable tracking of all targets.

5. Conclusions

In this paper, we first address the tracking filter, because existing ballistic target tracking methods in antimissile warfare are mostly based on a single motion model and thus cannot be applied over the entire duration of ballistic target tracking. To this end, we propose a variable structure adaptive multimodel box particle filtering tracking method. This method allows for continuous and persistent tracking of ballistic targets and reduces the computational complexity while achieving higher tracking accuracy. Then, with the Rényi information gain as the game payoff for the target detection (warning) and target tracking states, we propose a game theory-based dynamic programming method. Through the game design, it is possible to adaptively obtain more or better sensor resources for the communities of interest that need to be coordinated to detect and track the corresponding ballistic targets. Simulation experiments show that this method achieves good overall tracking performance while ensuring an adequate detection probability (warning capability).

Data Availability

The data used to support the findings of this study are included within the article. The code of the proposed method was supplied by the author Peng Ni under license and so cannot be made freely available. Requests for access to these data should be made to him via nipeng198509@163.com.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 61703426) and by the Young Talent Fund of University Association for Science and Technology in Shaanxi, China (Grant no. 20190108).