Abstract

As an NP-hard problem that must be solved in real time, dynamic task allocation for unmanned aerial vehicle (UAV) swarms has become both a difficulty and a focus of current planning research. To address the poor real-time performance and low solution quality of dynamic task allocation for heterogeneous UAV swarms in uncertain environments, this paper establishes a dynamic task allocation model that reflects practical needs and solves it with the binary wolf pack algorithm (BWPA), yielding a dynamic task allocation method for heterogeneous UAV swarms in uncertain environments. The method uses a dynamic mechanism of attacking while searching and attacking important targets first. A multitarget, multitask, multiconstraint dynamic task allocation model for heterogeneous multiaircraft platforms is established based on the target cost-effectiveness ratio and the task execution time window. A one-dimensional 0–1 coding method is adopted to encode the task allocation scheme. The wolf pack algorithm (WPA) is introduced briefly, and the paper focuses on BWPA, which has good computational robustness and strong global search ability, to solve the dynamic allocation model. The simulation results show that the designed method adapts well to changes in the numbers of targets and UAVs, has good stability and scalability, and can effectively solve the dynamic task allocation problem of heterogeneous UAV swarms in unknown environments. The established model and solution method can therefore provide a useful reference for task allocation and related problems.

1. Introduction

With promising application prospects in the military field, unmanned aerial vehicles (UAVs) perform reconnaissance, surveillance, target acquisition, real-time strike, and other tasks. However, a single UAV can hardly meet the needs of increasingly complex tasks. UAV swarms have therefore become more and more important, and UAV swarm task allocation has become a research topic. Task allocation for a UAV swarm determines the task plan of each UAV according to the task requirements, the UAV characteristics, and the mission payload performance, so as to give full play to each UAV and improve the overall efficiency of the swarm. According to the certainty of the allocation environment, UAV swarm task allocation can be divided into static and dynamic task allocation [1, 2]. In static task allocation, the swarm status is determined and the target locations are known and fixed, so the allocation is relatively simple. In dynamic task allocation, however, the target locations and the environment are constantly changing. UAV swarm dynamic task allocation is therefore not only an NP-hard problem that needs to be solved quickly [3] but also a key, difficult, and hot issue in current task planning [4]. At present, research on UAV dynamic task allocation at home and abroad mainly focuses on establishing and solving the model.

There are many research results on dynamic task allocation models. For example, Shima [5] proposed a cooperative multiple task assignment problem (CMTAP) model for UAVs. Building on [5], [6] improved the CMTAP model and established an extended cooperative multitask assignment model, but the constraints of the model were still not fully considered. In [7], the dynamic cooperative attack process was simplified into a two-stage static weapon-target assignment (WTA) process, and a dynamic WTA model was proposed to better describe the complex cooperative attack problem. In addition, [8] transformed dynamic task allocation into multistage static task allocation and constructed a multi-UAV dynamic reconnaissance resource allocation model, which improved the overall efficiency of multiple UAVs performing dynamic reconnaissance tasks. Meanwhile, [9] constructed a state information description model for UAVs performing tasks, which reduced the scale, time, and communication complexity of task allocation. For the cases of newly added tasks or platform loss, [10] adopted local adjustment of dynamic tasks to build a multibase, multi-UCAV (unmanned combat aerial vehicle) task allocation model, which improved task allocation efficiency and enhanced platform stability. Besides, [11] built an alliance formation model but ignored the impact of communication constraints such as communication distance and communication delay. Additionally, [12] constructed a multi-UAV task allocation model for an interval information environment, based on uncertain information about the revenue damage cost index, target value, and range cost index.

There are many methods for solving the dynamic task allocation problem. The commonly used ones are mainly market-based algorithms and intelligent optimization algorithms. Reference [13] studied the cooperative search and simultaneous attack of a UAV group on multiple targets and proposed a task allocation algorithm that can obtain the maximum overall efficiency in a real-time environment. Moreover, [14] proposed a contract network algorithm based on concurrency, which can quickly and effectively deal with various emergencies in task execution and better meets the needs of dynamic task allocation. For the timing problem of task allocation, [15] proposed a genetic algorithm based on matrix coding, which improves the efficiency of task allocation. At the same time, [16] used an improved k-means clustering algorithm to cluster and group the “UAV task” set and then used a split particle swarm optimization algorithm to reallocate the tasks of the multiple UAVs in each group. Although this strategy can obtain higher returns at the cost of reducing time, it is difficult to obtain the global optimal solution. Beyond that, [17] proposed a variable structure discrete dynamic Bayesian network reasoning algorithm for UAV task allocation in uncertain environments. Reference [18] used dynamic game theory to decompose the dynamic task allocation problem in swarm air combat into a game problem for a single UAV and proposed a predator-prey particle swarm optimization algorithm. Notably, this method is generally suitable only for engagements between small-scale swarms and can hardly handle confrontation between large-scale swarms. Meanwhile, [19] considered the advantages and disadvantages of the particle swarm optimization algorithm and the bacterial foraging algorithm and proposed a hybrid bacterial foraging algorithm to solve the dynamic task allocation problem of multiple UCAVs.

To sum up, although many results have been obtained on establishing and solving the model, the following problems remain: (1) The existing UAV dynamic task allocation models are idealized, easy to solve, and poorly dynamic. (2) There is little research on fully dynamic task allocation strategies. Current dynamic task allocation adjusts tasks on top of a preallocation scheme obtained from some target information, but in actual tasks such information is difficult to obtain in advance. (3) For the dynamic adjustment of a large-scale UAV swarm, the solution space grows, the solution set grows exponentially, and the effectiveness of the algorithm drops markedly. As a result, real-time and accurate allocation is difficult to ensure.

In order to get closer to combat reality, a fully dynamic task allocation strategy is adopted, and a dynamic task allocation method for UAV swarms based on BWPA is proposed. This method not only establishes a dynamic allocation model of attacking while searching and attacking important targets first, but also uses BWPA [20], which has good computational robustness and strong global search ability, to solve the model. As shown by the simulation results, this method better meets the needs of dynamic task allocation and can also provide a useful reference for solving problems in related systems such as relay networks and D2D systems [21, 22].

The structure of the paper is as follows: In Section 1, the motivation and current research status are discussed. In Section 2, a dynamic task allocation model is established. In Section 3, the novel dynamic task allocation mechanism is proposed. In Section 4, the solution method of discrete BWPA is introduced. In Section 5, the experiments are performed to evaluate the effectiveness and dynamic adaptability of the proposed dynamic task allocation method. In Section 6, the conclusion is drawn and the future work is discussed.

2. The Model of Dynamic Task Allocation

2.1. Problem Description
2.1.1. Task Allocation Scenarios and Constraints

The task allocation scenario considered in this paper is cooperative UAV swarm operation; namely, on the battlefield, a heterogeneous swarm composed of a number of search UAVs (S-UAVs) and a number of attack UAVs (A-UAVs) performs “reconnaissance attack” tasks on a set of unknown targets to be attacked.

In the scenario of UAV task allocation, the dynamic resource allocation for static targets is mainly studied. In order to simplify the problem and build the model, the influence of communication design and data timeliness [23–25] is not considered temporarily, and the following restrictions for UAV swarm operation are set:
(1) The battlefield environment is a two-dimensional plane, and there are no forbidden fly zones or various obstacles.
(2) There is no gray or black battlefield information, and the communication between UAVs works smoothly.
(3) The UAV is regarded as a particle, and the flight altitude and turning radius of the UAV are not considered temporarily.
(4) Assuming that the UAV reaches the target location, it will complete the reconnaissance or attack mission.
(5) The efficiency gain is not considered temporarily when UAVs perform tasks cooperatively.
(6) It is assumed that enemy targets are independent fixed targets, and the effect of targets on UAV is not considered temporarily.

2.1.2. The Description of Battlefield Environment

The battlefield environment is set as a bounded area . There are a series of static targets, the targets pose no threat to UAV, and the number and location of targets are unknown. As shown in Figure 1, the Cartesian grid recognition map is used to describe the battlefield environment, and the battlefield environment is discretized into units of number . Besides, there is only one target in each unit at most.

2.1.3. Battlefield Subject Description

UAV swarm and target to be attacked are the two main bodies of mutual confrontation in the swarm battlefield. Their respective attributes are described as follows.

In the UAV set , represents the total number of the UAVs, represents the type, and represents the total number in types, and represents the total number. This paper takes , which represents two types, respectively: reconnaissance UAV and attack UAV, reconnaissance UAV in number of NS and attack UAV in number of . UAV attributes are represented by four tuples , and the meaning of each symbol is shown in Table 1. Notably, is the maximum resource carried by the UAV at a time, and the corresponding UAV can attack targets requiring unit resources at most.

In the target set , represents the total number of targets. The attribute of the target is represented by four tuples (see the meaning of each symbol in Table 1).

2.2. Mathematical Model

In UAV swarm operation, considering the overall benefit of the swarm and drawing on common modeling methods [13], the UAV swarm task allocation model can be constructed as follows.

Assuming that the net income of a UAV performing each task is additive, the task decision variables can be defined as follows:

In this equation, , .

Then, the mathematical model of task allocation of UAV swarm can be expressed as

In the above equation, denotes the flight range of the UAV to perform the task on the target .

The constraints are as follows:
(1) Each task for each target can only be completed once by one UAV:
(2) The resources carried by the UAV performing the task on each target must not be less than the resources required by that target:
(3) The total amount of resources consumed by a UAV in performing tasks must not exceed its own resources:
(4) The total range of each UAV during mission flight must not exceed its maximum range:

In the above equation, represents the return voyage of UAV , and the represents the maximum voyage of UAV .
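The four constraints can be illustrated with a small feasibility check. The following is a minimal sketch, not the paper's formulation: the argument names (`x`, `carried`, `required`, `leg`, `ret`, `max_range`) are hypothetical, constraint (1) is read as "exactly one UAV per target," and the mission range of a UAV is assumed to be the sum of its per-target flight ranges plus its return voyage.

```python
from typing import Sequence


def is_feasible(
    x: Sequence[Sequence[int]],      # x[i][j] = 1 if UAV i performs the task on target j
    carried: Sequence[float],        # resources carried by each UAV (hypothetical name)
    required: Sequence[float],       # resources required by each target
    leg: Sequence[Sequence[float]],  # leg[i][j]: flight range of UAV i for target j
    ret: Sequence[float],            # return voyage of each UAV
    max_range: Sequence[float],      # maximum voyage of each UAV
) -> bool:
    """Check an assignment matrix against constraints (1) to (4)."""
    n_uav, n_tgt = len(x), len(required)
    # (1) each target is handled once, by one UAV (assumed: exactly one)
    for j in range(n_tgt):
        if sum(x[i][j] for i in range(n_uav)) != 1:
            return False
    for i in range(n_uav):
        assigned = [j for j in range(n_tgt) if x[i][j] == 1]
        # (2) the UAV carries at least what each assigned target requires
        if any(carried[i] < required[j] for j in assigned):
            return False
        # (3) total consumption does not exceed the UAV's own resources
        if sum(required[j] for j in assigned) > carried[i]:
            return False
        # (4) assumed range model: per-target legs plus return voyage
        if sum(leg[i][j] for j in assigned) + ret[i] > max_range[i]:
            return False
    return True
```

A check of this kind would typically be called on every candidate allocation before its objective value is evaluated.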

3. Dynamic Allocation Mechanism

In order to get closer to combat reality and meet the needs of dynamic tasks, this paper adopts the mode of attacking while searching. The UAV swarm relies entirely on reconnaissance UAVs to obtain battlefield target information. Until the battlefield has been fully searched, the swarm's information about battlefield targets keeps growing. This requires the UAV swarm to make full use of the known information, so that each allocation decision benefits the overall effectiveness of the swarm. As displayed in Figure 2, multiple reconnaissance UAVs search for targets according to the search cycle, so as to obtain the situation information in that cycle. The swarm system generates attack sequences according to certain rules, calls the task allocation module, and then transmits the allocation results to the attack UAVs. Each attack UAV then parses the attack command and attacks the corresponding target. If no target is detected in the search cycle and there is no previously unexecuted target, the task allocation module is not called in this cycle.

3.1. Multimachine Slice and Area Search

As shown in Figure 3, the reconnaissance UAVs divide the battlefield among themselves and search it with the reciprocating search method in [26] to find targets quickly. In order to reduce the number of turns, each UAV is designed to fly along the long side of its rectangle.
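A reciprocating (back-and-forth) search of one rectangular sub-area can be sketched as below. This is an illustrative sketch rather than the method of [26]: the function name `sweep_waypoints` is hypothetical, the lane spacing is assumed to be at most twice the reconnaissance radius, and the rectangle is assumed to be axis-aligned with one corner at the origin.

```python
import math


def sweep_waypoints(width: float, height: float, radius: float):
    """Back-and-forth waypoints covering [0, width] x [0, height].

    Long legs run along the longer side, which keeps the number of turns low.
    """
    long_side, short_side = max(width, height), min(width, height)
    spacing = 2.0 * radius                       # assumed swath width
    n_lanes = max(1, math.ceil(short_side / spacing))
    lane_width = short_side / n_lanes            # never wider than the swath
    points = []
    for k in range(n_lanes):
        offset = (k + 0.5) * lane_width          # lane centre line
        ends = (0.0, long_side) if k % 2 == 0 else (long_side, 0.0)
        for along in ends:
            # map (along, offset) back to (x, y) depending on the long side
            points.append((along, offset) if width >= height else (offset, along))
    return points
```

For several S-UAVs, the battlefield would first be split into sub-rectangles and the function applied to each one, with the resulting points shifted by the sub-rectangle's origin.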

3.2. Dynamic Distribution Mechanism

In this paper, only the attack task of the target is allocated, so that the target is directly used to represent the task. In order to solve the prominent problems such as slow computing speed and poor dynamics of centralized task allocation method, a dynamic allocation mechanism is designed. To be specific, it mainly realizes the following functions: (1) selecting the appropriate task allocation time to avoid the waste of computing resources caused by frequent allocation; (2) achieving the optimization of the objective function according to the attack sequence generated by the efficiency cost ratio of each target; (3) determining the target scale of task allocation with an appropriate task window to improve the real-time performance of task allocation.

3.2.1. Allocation Timing

According to the task completion status of the targets during the operation, targets are divided into three categories: targets attacked, targets to be executed, and targets to be assigned. The task allocation of the UAV swarm is driven jointly by time and events, and the search process is divided into several search cycles. At the end of every search cycle, the reconnaissance UAVs judge whether tasks need to be allocated. If there are new targets or targets missed in the previous cycle, they are recorded as the target task set to be executed and then allocated. Otherwise, there is no need to allocate. To ensure that, within each search cycle, an attack UAV can reach any position on the battlefield and complete the attack task against the targets allocated in the previous cycle, the search cycle should be longer than the time required for a UAV to fly the longest path, as displayed in (7): the search cycle must exceed the longest path between any two points on the battlefield divided by the flight speed of the UAV.
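The lower bound on the search cycle can be computed directly. The sketch below assumes a rectangular battlefield, for which the longest path between any two points is the diagonal; the function and parameter names are hypothetical.

```python
import math


def min_search_cycle(width: float, height: float, speed: float) -> float:
    """Lower bound on the search cycle: longest straight path / UAV speed."""
    longest_path = math.hypot(width, height)  # diagonal of the rectangle
    return longest_path / speed
```

Any search cycle chosen for the simulation would have to be strictly greater than the returned value, in the same distance and time units as the inputs.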

3.2.2. Attack Sequence

Similar to the method in [27], most current dynamic task allocation methods determine the order of target execution according to the time of occurrence or the distance. This approach is simple and clear. On an actual battlefield, however, targets are often time-sensitive, and high-value targets need to be attacked first. This paper therefore adopts the principle of “attacking while discovering, attacking high-value targets first” and takes the attack cost-effectiveness ratio as the basis for ranking the target attack order. In the current search cycle, the closer a target is to the swarm, the greater its value, and the fewer resources it requires, the higher its attack priority. The attack cost-effectiveness ratio of the target can be defined as

In (8), the target number is , represents the sum of all resources of the UAV swarm at time , represents the nearest distance from the UAV swarm at time , represents the value of the target , and represents the resources required for the target to be destroyed.

After the current targets to be executed are arranged in descending order of attack cost-effectiveness ratio, the attack sequence is formed as in (9); its length is the total number of targets to be executed at the current time.
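The ranking step can be sketched as follows. The ratio used here is only an assumed illustrative form that respects the stated monotonicity (higher value, shorter distance, and lower resource demand give higher priority); it should not be read as equation (8). The class `Target` and the functions `ratio` and `attack_sequence` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Target:
    ident: int
    value: float      # value of the target
    required: float   # resources required to destroy the target (> 0)
    distance: float   # nearest distance to the swarm at this time (> 0)


def ratio(t: Target, swarm_resources: float) -> float:
    # Assumed illustrative form, NOT the paper's equation (8)
    return t.value * swarm_resources / (t.distance * t.required)


def attack_sequence(targets: List[Target], swarm_resources: float) -> List[Target]:
    """Targets sorted by descending cost-effectiveness ratio."""
    return sorted(targets, key=lambda t: ratio(t, swarm_resources), reverse=True)
```

Because every target is scaled by the same swarm resource total, that term does not change the order within one cycle under this assumed form; it is kept only to mirror the quantities listed in the text.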

3.2.3. Task Window

For a task allocation problem with timeliness requirements, the allocation method must respond quickly to newly discovered targets and generate an allocation scheme whose implementation maximizes the overall benefit of the UAV swarm as far as possible. As shown in (10), in order to reduce the number of optimization targets, shorten the calculation time, and improve the allocation efficiency, a task window of fixed length is used to select the leading targets in the attack sequence for priority attack. When the number of targets to be attacked is less than the window length, all targets to be attacked are assigned in the window. When the number of targets to be attacked is greater than the window length, the tasks to be assigned in the window are the first targets in the attack sequence, up to the window length.
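The windowing rule amounts to truncating the attack sequence. A minimal sketch, with the hypothetical name `task_window`, also returns the targets left outside the window, on the assumption that they are carried over to the next cycle:

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def task_window(sequence: Sequence[T], window_length: int) -> Tuple[List[T], List[T]]:
    """Split an attack sequence into (targets in the window, targets left over)."""
    in_window = list(sequence[:window_length])
    pending = list(sequence[window_length:])  # assumed to wait for the next cycle
    return in_window, pending
```

With three pending targets and a window of length five, all three are allocated; with five pending targets and a window of length three, only the three highest-ranked ones are passed to the optimizer.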

To conclude, when a reconnaissance UAV finds a target, the target is not allocated immediately but only at the end of the search cycle, which avoids allocating over too few targets and wasting computing resources through frequent allocation.

4. Solution of Task Allocation Problem Based on BWPA

4.1. Task Allocation Scheme Code

The decision variables of the model consist of two dimensions. The task allocation scheme formed is difficult to express intuitively, and it is not convenient to adopt intelligent optimization algorithm for scheme optimization. As shown in Figure 4, a new one-dimensional binary code is designed to represent the task allocation scheme.

In Figure 4, represents the target vector determined by the combat target task table, with a total of targets. At the same time, the allocation scheme vector corresponding to each target is , and the subscript of represents the target number, , and represents the allocation scheme of the target . If the element is , it indicates that the swarm performs an attack task on the target ; if the element is , it reveals that the swarm does not attack the target .

In order to make the above coding method easier to understand, an example is introduced. A heterogeneous swarm of three different UAVs is set to attack four targets. Then, i = 1, 2, 3 and j = 1, 2, 3, 4 in the decision variable. As shown in Figure 5, for the target number vector Y, the one-dimensional binary code of the allocation scheme states that the UAVs numbered 3, 2, 3, 1 perform attack tasks on the targets numbered 1, 2, 3, 4, respectively. The corresponding decision variables are therefore those pairing UAV 3 with target 1, UAV 2 with target 2, UAV 3 with target 3, and UAV 1 with target 4.

Thus, one-dimensional binary code can be used to represent the meaning and allocation scheme represented by two-dimensional decision variables . This is convenient not only for visual representation, but also for algorithm optimization.
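The mapping between a one-dimensional 0–1 code and the two-dimensional decision variables can be sketched as follows. The layout is an assumption made for illustration (one block per target, with a single 1 marking the attacking UAV) and may differ from the layout in Figures 4 and 5; `encode` and `decode` are hypothetical names.

```python
from typing import List


def encode(assign: List[int], n_uav: int) -> List[int]:
    """assign[j] is the 1-based UAV number for target j+1; returns a flat 0-1 code."""
    code: List[int] = []
    for uav in assign:
        block = [0] * n_uav      # one block of n_uav bits per target (assumed layout)
        block[uav - 1] = 1
        code.extend(block)
    return code


def decode(code: List[int], n_uav: int) -> List[List[int]]:
    """Rebuild the decision matrix x[i][j] (UAV i, target j) from the flat code."""
    n_tgt = len(code) // n_uav
    x = [[0] * n_tgt for _ in range(n_uav)]
    for j in range(n_tgt):
        for i in range(n_uav):
            x[i][j] = code[j * n_uav + i]
    return x
```

For the example of three UAVs and four targets, `encode([3, 2, 3, 1], 3)` gives a 12-bit code, and `decode` recovers a matrix whose nonzero entries pair UAV 3 with target 1, UAV 2 with target 2, UAV 3 with target 3, and UAV 1 with target 4.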

4.2. Binary Wolf Pack Algorithm

The wolf pack algorithm (WPA) [28] is a swarm intelligence optimization algorithm proposed by Hu Sheng et al. It is built on the two rules “the winner is the king” and “the strong survives” and on an analysis of three behaviors in wolf hunting: “scout wolf wandering, lead wolf calling, and ferocious wolf sieging.” Compared with the genetic algorithm and the particle swarm optimization algorithm, WPA has the advantages of low parameter sensitivity, strong robustness, fast convergence, and high solution accuracy. Because of its good performance, it has become an effective algorithm for solving NP-hard problems and has been widely applied in many fields [29–34]. In order to apply WPA to discrete spaces, binary coding is introduced into WPA and the relevant definitions are modified to obtain BWPA [20].

4.2.1. Relevant Definitions

Let the solution space be a Euclidean space. The location of each wolf is a binary vector whose length is the coding length, and the number of such vectors is the population size. Due to binary encoding, each component can only take 0 or 1. The odor concentration of prey at the wolf's location is the fitness value of the objective function. As shown in (11), the distance between two wolves is represented by the Manhattan distance between their binary codes.
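For binary codes, the Manhattan distance is simply the number of positions at which two wolves differ. A minimal sketch, with the hypothetical function name `manhattan`:

```python
from typing import Sequence


def manhattan(p: Sequence[int], q: Sequence[int]) -> int:
    """Manhattan distance between two equal-length 0-1 vectors."""
    if len(p) != len(q):
        raise ValueError("wolf positions must have the same coding length")
    return sum(abs(a - b) for a, b in zip(p, q))
```

On 0-1 vectors this coincides with the Hamming distance, so it also equals the number of bit flips needed to move one wolf onto the other.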

Definition 1. Reverse: to reverse is to assign a value to in the position of wolf according to (12).

Definition 2. The kinematic operator: let the position of wolf be ; represents the set of coding bits that can be inverted and is not an empty set, namely, the movable range of wolf; represents the number of coding bits to be inverted, namely the motion step of the wolf. The kinematic operator indicates that coding bits are randomly selected in for inversion.

Example 1. If , , and , then or .
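The reverse operation and the kinematic operator can be sketched together. The names `reverse` and `move` are hypothetical, positions are 0-indexed lists, and clamping the step to the size of the movable set is an assumption made here so that the sketch never fails on a small set.

```python
import random
from typing import Iterable, List, Sequence


def reverse(bit: int) -> int:
    """Flip a single coding bit: 0 becomes 1 and 1 becomes 0."""
    return 1 - bit


def move(position: Sequence[int], movable: Iterable[int], step: int,
         rng: random.Random) -> List[int]:
    """Invert `step` randomly chosen bits whose indices lie in `movable`."""
    pool = sorted(set(movable))
    if not pool:
        raise ValueError("the movable set must not be empty")
    chosen = rng.sample(pool, min(step, len(pool)))  # clamp: assumption
    new_position = list(position)                    # leave the input untouched
    for k in chosen:
        new_position[k] = reverse(new_position[k])
    return new_position
```

Because the bits are drawn at random, repeated calls with the same arguments generally give different positions, which is what produces the alternative outcomes listed in an example such as Example 1.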

4.2.2. Description of Intelligent Behaviors and Rules

The BWPA also consists of the lead wolf generation rule “the winner is the king,” “survival of the fittest” wolf pack update mechanism, and three intelligent behaviors of “wandering, summoning, and siege.” The first two are consistent with the basic WPA [28], which will not be repeated here. The following part mainly describes the three intelligent behaviors.

(1) Wandering behavior. The wolf with the best fitness value is the lead wolf, and others are probe wolves. The fitness value of the current position of probe is calculated and recorded. The probe wolf advances one step in directions with the walking step , namely, times of motion operator , , , and is executed for the probe wolf position . At the same time, the fitness value corresponding to the position after each step is recorded, and then greedy decision is made. Let the corresponding fitness value obtained after moving in the direction be . If meets the conditions in (13), one step in the direction is advanced and the wolf detection position is updated. The above walking behavior will be repeated until , or the number of walking reaches the maximum number of walking , then turning to the calling behavior.

In (13) ; represents the fitness value of the position where the wolf is before migration; represents a random integer between .
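The wandering behavior can be sketched as a greedy local search. This is a simplified sketch under stated assumptions, not a transcription of (13): fitness is maximized, the number of directions `h` is passed in rather than drawn at random, each direction flips `step_a` bits anywhere in the code, and the probe wolf moves only to the best candidate that strictly improves its fitness. All names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def wander(position: Sequence[int], fitness: Callable[[Sequence[int]], float],
           lead_fitness: float, step_a: int, h: int, t_max: int,
           rng: random.Random) -> Tuple[List[int], float]:
    """Greedy wandering of one probe wolf for at most t_max walks."""
    current = list(position)
    current_fit = fitness(current)
    n = len(current)
    for _ in range(t_max):
        if current_fit > lead_fitness:       # better than the lead wolf: stop walking
            break
        best, best_fit = None, current_fit
        for _ in range(h):                   # probe h directions
            cand = list(current)
            for k in rng.sample(range(n), min(step_a, n)):
                cand[k] = 1 - cand[k]
            f = fitness(cand)
            if f > best_fit:                 # keep the best strictly improving move
                best, best_fit = cand, f
        if best is not None:
            current, current_fit = best, best_fit
    return current, current_fit
```

With `fitness=sum` as a toy objective, the returned fitness is never lower than that of the starting position, which is the greedy property the text describes.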

(2) Summoning behavior. All wolves except the lead wolf are fierce wolves. The lead wolf initiates a call. The fierce wolf quickly approaches the position of the lead wolf with a large attack step according to the equation below:

in (14) is obtained according to

In (15) , the initial value of is 1, and represents a null value. represents the set of coding bits of different values at the corresponding positions of and .

When is an empty set, the fierce wolf rushes one step according to the random motion operator .

The fitness value of the location of the fierce wolf is then compared with that of the lead wolf. If it is better, the lead wolf's fitness value is updated to it, and the fierce wolf replaces the lead wolf and initiates a call. Otherwise, the attack continues until the condition in (16) is met; then the call ends and the siege begins.

In the above equation, represents the distance between the fierce wolf and the lead wolf; represents the determination distance and denotes the distance determination factor.
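One rushing step of the summoning behavior can be sketched as follows. The sketch assumes that the fierce wolf inverts `step_b` bits drawn from the positions where it differs from the lead wolf, and falls back to a random move over the whole code when the two positions are identical; the stopping test in (16) is left to the caller. The names `differing_bits` and `rush` are hypothetical.

```python
import random
from typing import List, Sequence


def differing_bits(wolf: Sequence[int], lead: Sequence[int]) -> List[int]:
    """Indices at which the fierce wolf and the lead wolf hold different values."""
    return [k for k, (a, b) in enumerate(zip(wolf, lead)) if a != b]


def rush(wolf: Sequence[int], lead: Sequence[int], step_b: int,
         rng: random.Random) -> List[int]:
    """Move a fierce wolf one rushing step towards the lead wolf."""
    diff = differing_bits(wolf, lead)
    # empty difference set: random motion over the whole code (assumed fallback)
    pool = diff if diff else list(range(len(wolf)))
    new_position = list(wolf)
    for k in rng.sample(pool, min(step_b, len(pool))):
        new_position[k] = 1 - new_position[k]
    return new_position
```

Each flipped bit taken from the difference set reduces the Manhattan distance to the lead wolf by one, so a wolf at distance 3 with a rushing step of 2 ends at distance 1.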

(3) Siege behavior. Taking the current position of the wolf as the position of the prey, the fierce wolf sieges the prey according to

In the equation , refers to the attack step size.

The wolf makes greedy decisions based on the size of the fitness values of the anterior and posterior positions.

The three steps, walking step , rushing step , and attacking step , involved in the algorithm are integers, which represent the fineness of wolf search. There are usually the following relationships:

(4) Repair mechanism. The wolves are eliminated according to the updated scale factor , and then new wolves are randomly generated. represents a random integer within . All wolves must meet relevant constraints; otherwise repeat the motion operator for repair. The set is obtained by

In the above equation, , the initial value of is 1, and represents a null value.

4.2.3. Algorithm Process

The specific steps of BWPA algorithm are as follows:

Step 1. Initialization. Initialize the wolf population size, the initial positions of the wolves, the maximum number of walks, the maximum number of iterations, the walking step, the rushing step, the attack step, the distance determination factor, and the update scale factor.

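Step 2. Select the lead wolf. The other wolves act as probe wolves and wander until a probe wolf's fitness value is better than that of the lead wolf or the maximum number of walks is reached, and then turn to Step 3.

Step 3. The fierce wolves rush towards the prey according to (14). If a fierce wolf's fitness value is better than that of the lead wolf, it replaces the lead wolf and initiates a call. Otherwise, the attack continues until the condition in (16) is met; then switch to Step 4.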

Step 4. The fierce wolf sieges according to (17).

Step 5. Updating the wolf pack according to the wolf pack update mechanism and repairing the wolves that do not meet the conditions.

Step 6. Judge whether the termination conditions are met. If so, output the optimal solution, that is, the position of the lead wolf and its corresponding fitness value. Otherwise, go back to Step 2.

4.3. Time Complexity Analysis

Time complexity is an important embodiment of the efficiency of the algorithm. This paper uses the same method in [35, 36] to analyze the time complexity of the solution method of task allocation problem based on BWPA.

In WPA, set the population size as and the individual dimension as . When the step size is , the update scale factor is , the maximum number of walks is , and the determination distance and other parameters have an initialization time of , a random number generation time of , and a model solution time of , the time complexity of the initialization phase is

According to the fitness value, the time for selecting the lead wolf is , the time for detecting the wolf to execute the wandering strategy is , the time for artificial wolf to execute the calling behavior is , the time required for artificial wolf to approach the position of the lead wolf in each dimension is , the time for judging whether to siege is , and the time for siege is ; then the time complexity of this stage is

Therefore, the total time complexity of each iteration of WPA is

Analyze the process of the proposed method. In the initialization stage, the encoding time of the task allocation scheme is , the decoding time of the task allocation scheme is , and the other generation parameters, dimensions and model solutions are the same as those of WPA. Then the time complexity of the initialization phase of the proposed method is

In addition, the selection of the lead wolf and the execution of intelligent behaviors such as wandering, summoning, and siege are the same as the WPA algorithm, so the time complexity of this stage is

Therefore, the total time complexity of each iteration of the proposed method is

To sum up, compared with WPA, the time complexity of the proposed method has not changed, and the operation efficiency of the algorithm has not been reduced.

5. Numerical Simulation and Analysis

The performance of the proposed dynamic task allocation method is verified by two sets of examples. The simulation environment is a HONOR MagicBook Pro with Windows 10 and an Intel Core i5-10210U, and the program is implemented in MATLAB R2019b. Parameter setting: the battlefield area is . Reconnaissance UAV: the flight speed is , the reconnaissance radius is , and the maximum flight range is . Attack UAV: the flight speed is and the maximum flight range is . The first group of experiments tests the effectiveness and feasibility of the algorithm: a UAV swarm composed of 2 reconnaissance UAVs and 6 attack UAVs performs reconnaissance attack tasks on 20 ground targets. The second group of experiments tests the stability and robustness of the algorithm. Based on the Monte Carlo method, the impact of changes in the numbers of UAVs and targets on the performance of the algorithm is analyzed, and the performance of the algorithm is verified when the UAV swarm scale and the number of targets are large.

5.1. Effectiveness and Feasibility Test

In order to test the effectiveness and feasibility of the proposed method, a UAV swarm composed of two reconnaissance UAVs and six attack UAVs has been used to perform reconnaissance attack tasks on 20 ground targets. At the initial time, the specific parameter settings of UAVs and targets in the swarm are shown in Tables 2 and 3.

In the heterogeneous UAV swarm, two S-UAVs search targets in the 2000×1800 range mission environment according to the multimachine partition and zoning mechanism, and the results are shown in Figure 6. It can be seen intuitively from Figure 6 that, within 8 time periods, 20 targets are detected by and , respectively, as time goes by.

The BWPA is used to solve the above swarm dynamic task allocation problem. Set the number of wolves , the maximum number of iterations , the maximum number of walks , the decision distance , the walking step , the running step , the attack step , and the update scale factor is 5. As shown in Table 4, we can directly obtain the target number vector ; then the position can be expressed as code , , , , , and . By solving the above problems through BWPA, the variation curve of UAV swarm travel (objective function value) with time is displayed in Figure 7.

As shown in Figure 7, the total travel of the UAV swarm increases along a polyline as the task execution time grows. After 8 task cycles the task is completed, with a total swarm travel of 510.4 km. This is consistent with the reality of swarm task execution, indicating that the proposed method is feasible for solving the dynamic task allocation problem.

Figures 6 and 8 visually show the whole process of task allocation, coordination, cooperation, and execution of heterogeneous UAV swarm starting from the base after 8 time cycles. From the figure, you can master the position, moving track, assigned task, task execution status, and other information of UAV in the task environment within 2000×1800 at any time. When , is detected by . According to the execution cost-effectiveness ratio of each UAV, executes it, and the other UAVs turn into standby state. When , , , and are detected by and is detected by . , , , and are allocated to , , , and , respectively, according to the cost-effectiveness ratio of each A-UAV. When , and are detected by in turn, and is detected by . Since the cost-effectiveness ratio of , , and is better than that of , , , and are allocated to , , and , respectively, and turns into standby state. When , is found by , and the efficiency cost ratio of executed by is significantly better than that of other A-UAVs. Therefore, executes , and the other A-UAVs are in a standby state. When , and are found by . According to the allocation mechanism based on the implementation efficiency cost ratio, and are allocated by and . When , is detected by , and and are detected by and then executed by , , and in the swarm, respectively. When , is detected by , and and are detected by in turn and then executed by , , and in the swarm, respectively. When , and are detected by , and is detected by and then executed by , , and in the swarm, respectively. In the whole task execution process, the characteristics of UAV are fully utilized. The task allocation is coherent and the task execution process is compact. There is no spare time waiting for UAV during the execution of each task.

Figure 9 shows how the resources of each UAV change over the 8 units of time during which the UAV swarm performs tasks. The task load is distributed reasonably across the UAVs, resource consumption is relatively balanced, and the UAVs do not interfere with one another's work. In other words, the proposed method improves UAV resource utilization, gives full play to the advantages of collaborative work, makes capabilities and resources complementary, and helps the swarm retain the long-term capacity to deal with emergencies.

To sum up, when the total resources carried by the UAVs meet the needs of the current task, the UAV swarm can respond to each newly found target by jointly weighing the resources held by each UAV, the task distance, the task value, and the resources the task requires. It then allocates the task to the UAVs promptly and efficiently and executes it quickly, which realizes fully dynamic task allocation. This behavior reflects the close coordination of multiple UAVs, ensures the maximum net benefit of executing tasks (i.e., maximizing the benefit and minimizing the cost of execution), and shows that the allocation method is reasonable.
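The allocation logic summarized above can be illustrated with a small sketch. It is only a greedy stand-in under stated assumptions, not the BWPA solver used in this paper: the ratio function (task value divided by travel distance), the class names, and the field names are hypothetical, and each UAV is assumed to take at most one target per task cycle.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Uav:
    name: str
    x: float
    y: float
    resources: int  # remaining task resources (hypothetical unit)


@dataclass
class Target:
    name: str
    x: float
    y: float
    value: float  # task value (hypothetical scale)
    demand: int   # resources required to complete the task


def ratio(uav, target):
    """Assumed cost-effectiveness ratio: task value per unit of travel distance."""
    distance = hypot(uav.x - target.x, uav.y - target.y)
    return target.value / max(distance, 1e-9)


def allocate(uavs, targets):
    """Greedy one-cycle allocation: important targets first, best-ratio free UAV wins."""
    free = list(uavs)
    plan = {}
    for target in sorted(targets, key=lambda t: t.value, reverse=True):
        # Only UAVs that still carry enough resources can take the task.
        capable = [u for u in free if u.resources >= target.demand]
        if not capable:
            continue  # the target waits for a later task cycle
        best = max(capable, key=lambda u: ratio(u, target))
        plan[target.name] = best.name
        best.resources -= target.demand
        free.remove(best)  # UAVs left in `free` remain on standby
    return plan


uavs = [Uav("A1", 0.0, 0.0, 2), Uav("A2", 10.0, 0.0, 2)]
targets = [Target("T1", 1.0, 0.0, 5.0, 1), Target("T2", 9.0, 0.0, 3.0, 1)]
plan = allocate(uavs, targets)
```

In this toy case the higher-value target is served first and goes to the nearer UAV, while any UAV without an assignment stays on standby, which mirrors the priority-attack behavior described in the text.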

5.2. Stability and Robustness Test

In order to further test the stability and robustness of the proposed algorithm, Monte Carlo experimental method is used to set the number of tasks as 10, 15, and 20, respectively, and the number of UAVs in the cluster as 6, 10, 12, and 15, respectively. The ratio of S-UAV and A-UAV in UAV swarm is shown in Table 4. Beyond that, the relevant parameters of the task are the same as those in the previous section. The tasks of the four groups are , , and , respectively. The swarm parameters are displayed in Table 5. The swarm members of the four groups are and , and , and , and and , respectively.

The method proposed in this paper, binary particle swarm optimization (BPSO) [37], the binary artificial fish swarm algorithm (BASFA) [38, 39], and the genetic algorithm (GA) [5, 40] are each run 50 times independently on the above grouping experiments, and the average track cost (ATC) and the average total calculation time (ATCT) are recorded. ATC is the average UAV trajectory length at the point where all targets have been destroyed. ATCT is the average time taken for all targets to be destroyed. The simulation terminates when all tasks are completed (i.e., all task resource requirements are 0). The four solution methods use the same general parameters. For example: the population size is (wolf swarm, particle swarm, fish swarm, etc.); the maximum number of iterations is . The other parameters are set according to [37], [38], and [40], as shown in Table 6.
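The two statistics can be computed as in the following minimal sketch. It assumes a callable that performs one complete allocation run and returns the total track length; the function run_once below is a hypothetical placeholder that returns random values, not one of the four solvers, and wall-clock time is used as a proxy for the calculation time.

```python
import random
import time
from statistics import mean


def run_once(rng):
    """Hypothetical stand-in for one full allocation run; returns a track length in km."""
    return 500.0 + rng.uniform(-20.0, 20.0)


def monte_carlo(solver, runs=50, seed=0):
    """Run `solver` independently `runs` times and return (ATC, ATCT)."""
    rng = random.Random(seed)
    track_costs, wall_times = [], []
    for _ in range(runs):
        start = time.perf_counter()
        track_costs.append(solver(rng))  # track cost of this run
        wall_times.append(time.perf_counter() - start)  # elapsed time of this run
    return mean(track_costs), mean(wall_times)


atc, atct = monte_carlo(run_once)
```

Replacing run_once with a real solver for each method, on the same task and swarm groups, would yield one (ATC, ATCT) pair per method and group for comparison.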

The ATC results of the four methods obtained from the simulation experiments are displayed in Figures 10(a)–10(c).

As shown in Figure 10, when the target number is 5, the four methods achieve the same results no matter how the swarm size changes. In other words, when the number of targets is small, the swarm size has little impact on the ATC obtained by the four methods, and the established model adapts well to each algorithm. When the target number is 10, the change in swarm size has a particularly significant impact on the ATC obtained by BPSO and GA, whereas the proposed method and BASFA remain unaffected. When the target number is 15, all four methods are affected to varying degrees, but the ATC of the proposed method fluctuates within a small range, which is significantly better than that of the other methods. These results show that the proposed method has good robustness and strong adaptability to the environment. In general, the number of UAVs has no significant impact on the ATC of the proposed algorithm, which achieves a relatively ideal ATC in various situations. The results obtained for the proposed model by the other algorithms also reflect the good adaptability of the designed model.

As shown in Figure 11, when the number of UAVs is constant, the ATCT of the four algorithms increases as the number of targets increases. When the number of targets is constant, the ATCT increases with the change in the number of UAVs. The ATCT trends of the four methods in Figures 11(a)–11(c) are basically the same. When the number of UAVs is 9, the ATCT increment of the four methods is the largest, but the ATCT of BWPA is always better than that of the other methods. This shows that the proposed method has higher solution efficiency and better dynamic adaptability than the other methods. When the target number equals the swarm size, the ATCT of the other methods is about twice that of BWPA or more, so BWPA shows better robustness. In general, the proposed method achieves the best ATCT results in various situations and has good real-time performance.

In summary, the comparative experiments show that most algorithms can obtain satisfactory results with the proposed model, which demonstrates its applicability. As far as ATC and ATCT are concerned, the proposed algorithm performs better than the comparison algorithms in terms of UAV task load and resource utilization balance, calculation speed, and solution efficiency. The numbers of UAVs and targets have no significant impact on the algorithm, indicating that the method has good stability, dynamic adaptability, and scalability and can better meet the needs of dynamic allocation.

6. Conclusion

This paper investigates the dynamic task allocation of UAV swarms. According to the real-time requirements of operation, a dynamic task allocation model with multiple targets, multiple tasks, heterogeneous multiaircraft platforms, and multiple constraints has been designed based on the target cost-effectiveness ratio and the task execution time window. The designed model adopts the mode of attacking while searching and giving priority to important targets. Information on battlefield targets is acquired entirely by the reconnaissance UAVs, which search for targets according to the search strategy, and the sequence of tasks to be attacked in each task cycle is generated in line with the cost-effectiveness ratio and the task execution time window. Furthermore, the task allocation scheme adopts one-dimensional 0–1 coding, and the dynamic allocation model is solved using BWPA. Finally, two groups of simulation experiments are used to verify the effectiveness and dynamic adaptability of the proposed method. According to the results, the designed task allocation method not only adapts well to changes in the numbers of targets and UAVs, with good stability and scalability, but also effectively solves the dynamic task allocation problem of heterogeneous UAV swarms in unknown environments. Therefore, the established model and solution method can provide a useful reference for task allocation and other related problems. However, this paper only considers static targets and does not consider constraints such as target threats, which limits the results. The dynamic task allocation of UAV swarms with dynamic targets and target threats will therefore be studied further.

Data Availability

The data used to support the findings of this study are included in the article and are available from the corresponding authors upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was partially supported by the Military Science Project of National Social Science Foundation (2019-SKJJ-C-092), Natural Science Foundation of Shaanxi Province (no. 2020JQ-493), Military Equipment Research Project (WJ2020A020029), and Equipment Comprehensive Research Project (WJY20211A030018).