Abstract

With the development of e-commerce, the last-mile delivery has become a significant part of customers’ shopping experience. In this paper, an autonomous last-mile delivery method using multiple unmanned ground vehicles is investigated. Being a smart logistics service, it provides a promising solution to reduce the delivery cost, improve efficiency, and avoid the spread of airborne diseases, such as SARS and COVID-19. By using a cooperation strategy with multiple heterogeneous robots, contactless parcel delivery can be carried out within apartment complexes efficiently. In this paper, the last-mile delivery with heterogeneous UGVs is formulated as an optimization problem aimed at minimizing the maximum makespan to complete all tasks. Then, a heuristic algorithm combining the Floyd’s algorithm and PSO algorithm is proposed for task assignment and path planning. This algorithm is further realized in a distributed scheme, with all robots in a swarm working together to obtain the best task schedule. A good solution with an optimized makespan is achieved by considering the constraints of various robots in terms of speed and payload. Simulations and experiments are carried out and the obtained results confirm the validity and applicability of the developed approaches.

1. Introduction

Unmanned ground vehicles (UGVs) have been successfully applied in diverse applications, such as planet exploration on the Moon and Mars [1, 2]. In recent years, UGVs have also been used for logistic services to reduce delivery costs [3]. Benefiting from autonomous technologies, UGVs can operate 24 hours per day and 7 days per week, which provides flexibility in terms of service time compared with conventional employee scheduling. Given the environment in which UGVs run, task assignment and path planning should be performed considering the parameters of parcels and UGVs. However, it is challenging to find an optimal solution with multiple UGVs under physical constraints. This problem is known as the vehicle routing problem (VRP), which was proved to be NP-hard [4].

The VRP model was first proposed by Dantzig and Ramser in 1959 [5] and has received increasing attention since then. A typical VRP aims at optimizing product delivery by vehicles from one depot or multiple depots to customers under certain constraints. There exist many extensions of the VRP, which can be classified into different categories according to configurations, problem modelling, solving algorithms, and objectives [616]. For example, this problem can be formulated differently considering homogeneous vehicles or heterogeneous vehicles [6, 7]. It can be further extended by adding different constraints to the original definition such as time constraints, traveling distance constraints, and capacity constraints [11, 12, 15, 17].

Typically, at the algorithm level, a VRP to solve is abstracted as a graph associated with vertices and arcs, such as a directed graph, an undirected graph, a connected graph, or a weighted graph. In the last decades, many algorithms were proposed and investigated, which include those based on branch and bound, cutting planes, dynamic programming, and heuristic approaches. Due to the complexity of VRPs, heuristic algorithms often perform better in terms of computing time and flexibility and therefore are often preferred in practical applications. In [14], genetic algorithms are applied to solve the multidepot VRP, considering total traveling distance and total traveling time as two objectives. In [4], enhanced ant colony optimization (ACO) algorithms are proposed, which allow ants visit the depots more than once until they reach all customers. In [18], a heuristic algorithm combining a variable neighborhood search algorithm and a space saving heuristic algorithm is proposed for logistic optimization. However, most researches mainly considered homogeneous UGVs which are not always true in reality.

In this paper, the multiple heterogeneous UGV model is adopted for last-mile delivery within apartment complexes. The delivery service is divided into the VRP and the bin packing problem to achieve better performance. Utilizing the distributed feature of the PSO algorithm, we have developed a coordinate controller that can be installed within UGVs. The design of the controller is shown in Figure 1. Each controller consists of two parts, namely, the multinode coordinator and the PSO optimizer engine. This engine can be treated as a regular PSO optimizer that iterates to search in the solution space to obtain the best solution. The multinode coordinator acts as a lightweight coordinator for the whole team. Each controller has a multinode coordinator, while at any given time, only one coordinator is active, so that the whole team follows the guidance of one coordinator.

The coordinator takes care of the following tasks:(i)Generating a leader coordinator by competition;(ii)Sending messages to start and stop the optimization process;(iii)Unifying the best solution that is assigned to each UGV.The main contributions of this paper can be summarized as follows:(i)Formulate the last-mile delivery with heterogeneous UGVs as an optimization problem aimed at minimizing the maximum makespan;(ii)Propose an algorithm combining the Floyd’s algorithm and PSO algorithm for task assignment and path planning;(iii)Propose a distributed logistic control algorithm validated using semiphysical simulations and experiments.

The remainder of this paper is organized as follows. Some existing techniques are reviewed in Section 2. Section 3 presents the proposed mathematical formulation of the last-mile delivery problem with multiple heterogeneous UGVs. The proposed algorithm is introduced in Section 4. Coordinate controller for distributed logistic with multiple UGVs is given in Section 5. Validation experiments are reported in Section 6. Finally, Section 7 concludes the paper.

Planning the shortest path is a key issue for vehicles in a transport system. Various methods have been proposed so far, such as graph search-based planners, sampling-based planners, interpolating curve planners, and numerical optimization approaches [19]. In the following subsections, we mainly focus on graph search-based shortest path algorithms, heuristic searching algorithms, and bio-inspired optimization algorithms.

2.1. Graph Search-Based Shortest Path Algorithms

There exist many algorithms to solve the shortest path problem in a graph, among which we can find the Dijkstra’s algorithm, the Bellman–Ford algorithm, and the Floyd’s algorithm.

The Dijkstra’s algorithm was proposed by Eds Wybe Dijkstra in 1959 [20]. It uses a greedy algorithm model and is a typical alternative to the shortest path method. It is one of the most commonly used algorithms to find the shortest path in graph theory. The main idea of this algorithm is to calculate all paths from a certain point to other vertices, select the shortest path, and then add the point to the final shortest path.

The main difference between Dijkstra’s algorithm and Bellman–Ford algorithm is that the latter can use negative weights to determine the distance value [21]. The Bellman–Ford algorithm is mainly used to solve the shortest path from a single source and the distance between two points in a directed graph. If there is a loop between these two points, in order to prevent the shortest path to be calculated continuously, the Bellman–Ford algorithm can identify negative cycles and deal with the negative loop problem. Because the Bellman–Ford algorithm restarts from the same source point to perform the “relaxation” update operation every time, it is slower than Dijkstra’s algorithm for the same problem.

Floyd’s algorithm is a classic dynamic programming algorithm and was first published in 1962 [22]. It is an algorithm to solve the shortest path between any two points in a weighted graph. It can correctly handle the shortest path problem of directed graphs comprising some negative weights, and it is also used to calculate the transitive closure of directed graphs. In [23], the author improved the Floyd’s algorithm by replacing fixed weights with probability time-consuming weights. The least time path is calculated based on the edge probability and the time weight, which improved the speed of the Floyd algorithm.

To summarize, the Dijkstra’s algorithm adopts a greedy approach to calculate the shortest path, whereas both the Bellman–Ford algorithm and the Floyd’s algorithm use dynamic programming.

The algorithms mentioned above have been widely used in seeking the shortest path for vehicles in transport systems, with modifications made according to the application background. In [24], a modified Dijkstra’s algorithm has been proposed for route planning in public transport systems. The algorithm takes the number of transfers and displacement distances into account, making it superior to the classic Dijkstra’s algorithm. In [25], the Dijkstra’s algorithm is modified to find the maximum load path to improve the freight transport efficiency. In [26], the Dijkstra’s algorithm is combined with the impedance function model to find the shortest path in a parking lot, taking the dynamic time of routes into account. These methods are further combined with heuristic searching algorithms. In [27], the Floyd’s algorithm is combined with the genetic algorithm (GA) for route planning in multimodal transport. The Floyd’s algorithm is used to find the shortest path and the GA algorithm is used to obtain the optimal transport mode. Following a similar idea, in [28], the Dijkstra’s algorithm is combined with an ant colony optimization to search the path for robots. The Dijkstra’s algorithm is applied to initial path planning, while a modified ant colony optimization algorithm is then applied to optimize the initial path. The algorithm proposed in our paper is also a combination of a classic shortest path algorithm enhanced with heuristic searching algorithm, combined with constraints of application background taking into account.

2.2. Heuristic Searching Algorithms

Theoretically, a graph search-based algorithm can usually provide the shortest path. However, with the increase of the scale of the problem, the computing time needed becomes intolerable. Heuristic searching algorithms are then developed to provide a solution within acceptable computing time. This is achieved by guiding the search direction with some kind of heuristic information. From this perspective, heuristic searching methods can be treated as an extension of graph search planners. The heuristic searching algorithms are usually used to solve practical applications. For example, a heuristic method was proposed to improve efficiency and merging capacity for highway T-junctions in [29].

Some significant representative heuristic searching algorithms are the algorithm and its dynamic version, the algorithm [30]. Extension of these algorithms includes the field [31], the anytime repairing , and the anytime [32]. So far, these methods have been widely used for path planning of different kinds of delivery vehicles. algorithm is used to guide UAVs to find a safe path in battle fields [33], and algorithm is combined with artificial bee colony to guide the motion of multiple UAVs in a 3D environment [34]. In an underwater environment, the algorithm is adopted to guide underwater robots to track chemical plume [35]. Heuristic searching algorithms also help UGVs achieve excellent performance in the DARPA Urban Challenge, where hybrid [36] and [37] were used in Stanford University’s UGV and KIT’s UGV, while was used in Boss, the winning vehicle [38]. In [39], a rule-based algorithm is presented to find the shortest path in the graphical domain for the emergency vehicle lane preclearing problem. In [40], the deterministic algorithm with a stepwise strategy is applied to solve a cooperative sorting problem for connected and automated vehicles in a multilane platoon.

2.3. Bio-Inspired Optimization Algorithms

Due to limitations of traditional path planning approaches, algorithms inspired by the observation of natural systems are developed to solve this NP complex problem [41], which attracts a growing interest of researchers. Among bio-inspired algorithms, the more widely used ones include the GA, the particle swarm optimization (PSO) algorithm, and the ant colony optimization (ACO) algorithm.

GA is a popular search-based algorithm, which was initially proposed by Bremermann in 1958 [42]. GA is a means to solve optimization problems. It generally relies on an objective function that must be maximized or minimized under specified constraints. In this case, a population of candidate solutions is assigned to a given problem, and individuals in the population are assigned adaptive capacity values according to the objective function. Individuals in the population are selected according to the environment, and gene transfer can be carried out by exchanging genetic materials that define possible solutions. Mutations in the population ensure diversity and avoid premature convergence of the algorithm, which may lead to missing the optimal result. Compared with local algorithms, genetic algorithms can make better use of historical information. Shibata et al. proposed a strategy using GA in a static environment for the application of path planning [43]. In an unknown environment, GAs only need very little prior information to complete path planning and an implementation can be found in [44].

PSO is a heuristic algorithm based on natural phenomena, which was initially intended for simulating the social behavior of fish or birds. The basic idea was first proposed by Eberhart and Kennedy in 1995 [45] and soon became a popular optimization algorithm used to solve various problems in engineering and science. It does not require any individual leader and the required solution is obtained via guiding the particles’ search by particles close to the food. Each particle considered through PSO represents a possible solution in the algorithm. At present, PSO is widely used in many fields, including multirobot positioning and path planning [4649]. The use of PSO helps to reduce the computational effort to find a solution, maintain stable convergence characteristics, avoid falling into local optimum, and obtain higher-quality path planning results.

Ant system is an intelligent bionic optimization algorithm inspired by the foraging behavior characteristics of ant populations in nature by Italian researcher Dorigo [50]. It is a basic ACO algorithm, which provides a framework for other ACO algorithms. ACO has some drawbacks such as slow convergence speed, a tendency to fall into local optimum, and premature convergence. In response to these problems, many researchers have proposed effective improvements to the algorithm framework and structure, such as ant colony system [51, 52], max-min ant system [53], ant system with elitist strategy, and ant system with elitist strategy and ranking [54]. These improved algorithms have effectively improved the optimization ability, but a fixed pattern is used to update the pheromone and probability transfer rules, which lacks flexibility and fails to solve the problem of premature convergence.

In this paper, we combine the Floyd’s algorithm with PSO algorithm to solve the last-mile delivery with multiple heterogeneous UGVs. The PSO algorithm is adopted to obtain the best task assignment solution for each UGV. The Floyd’s algorithm is then applied to provide the waypoint sequence a UGV should follow. More details are presented in the following section.

3. Problem Formulation considering Multiple Heterogeneous UGVs

The problem to be solved in this paper can be modeled as follows. We use robots to deliver packages to destinations from a warehouse. Robots are identified by an index , ranging from 1 to . Generally, the number of packages are not the same as the number of destinations, because there is a chance that some destinations receive no package, while some receive more than one package. As a result, is generally different from . In practice, we can ignore the destinations that receive no package. Meanwhile, if multiple packages must be delivered to the same destination, they can be treated as one big package. Thus, the problem of package delivery can be simplified and we get . The maximum weight a robot can carry is , and the weight of each package to its destination is . The cost for robot moving from the warehouse (represented by 0) or destination to destination is represented as . The cost depends on the length of the path it chooses. The problem that needs to be solved is thus to spend minimum cost to deliver all the packages to their destinations. We define two more variables as follows:

Thus, the problem can be modeled as minimizing total delivery cost. Then, we havewhere stands for the shortest time for robot returning to the warehouse from destination . Equation (2b) implies that at one time, the weight of all packages carried by a robot should not exceed its ability. However, the weight of all packages allocated to a robot can exceed the ability of the robot in which case the packages are delivered through multiple deliveries. Equation (2c) implies that all packages are delivered.

The solution of the abovementioned problem is a set of and that can be encoded aswhere stands for the case when the th robot delivers a package to destination . Based on this solution, we can further obtain the sequence of points that guides the movement of each robot.

In this paper, we are seeking to provide a schedule for all robots so that the task can be accomplished as fast as possible. The schedule must satisfy the following constraints:(i)All packages are initially stored at a warehouse. Initially, the robots can be randomly deployed. After accomplishing the tasks, all robots need to return back to the warehouse.(ii)The packages carried by a robot should not be heavier than the maximum payload of the robot.

In order to solve this problem, we combine the Floyd’s algorithm with the PSO algorithm to find best solutions. The PSO algorithm is adopted to obtain the best task assignment. It specifies both the collection of packages delivered by certain robots and the order of delivering the packages. The Floyd algorithm is then adopted to provide the waypoint sequence the robot should follow.

To apply the PSO algorithm to solve this problem, the task assignment should be encoded as particles. Each particle is defined as a vector with multiple elements. The indexes of elements in one vector correspond to packages. The elements are generated randomly within a predefined range. The integer part of an element indicates the ID of a robot, and the fractional part indicates the order. The smaller the fractional part is, the earlier a package is delivered.

For example, if 2 robots are sent out to deliver 5 packages, a particle can be

This particle means that robot 1 delivers package 1 and package 3 since the first and third elements are 1.2 and 1.1, whose integer parts are 1. Package 3 should be delivered before package 1 because 1.1 is smaller than 1.2. Thus, the IDs of packages delivered by robot 1 are elements , and they are delivered following this order. Similarly, robot 2 is in charge of delivering elements with the corresponding order.

The fitness function can be given bywhere is the time needed for robot to finish its task. With this definition, the PSO algorithm can be applied to get the best task assignment solution. We then need to plan path for each robot. This is achieved by connecting the destinations of the packages. We use Floyd’s algorithm to get the shortest paths between pairs of the destinations, and the path for robots can be obtained by connecting these paths.

4. Hybrid Algorithm for Task Assignment and Path Planning

This section introduces the algorithm to assign tasks and plan path for robots. The input of the algorithm is three matrixes: (1) the adjacency matrix of the map , (2) the features matrix of the robots , and (3) the weight matrix of the packages . The map is structured as a graph, with the warehouse and destinations being the nodes of the graph and paths connect nodes with edges. The weights of the edges are the length of corresponding paths. The feature matrix indicates the features of the robots. Each line indicates a robot, with the first row being speed, the second row being maximum payload, the third row being the time for the robot to pick up a package, and the fourth row being the time to drop a package. The weight matrix indicates the weight of the packages.

The output of the algorithm is matrixes as , each corresponding to one robot. Each output matrix contains two rows. The first row contains a sequence of points of the way for the robot to follow, and the second row indicates the action the robot should perform at a specified waypoint.

The method proposed in this paper comprises three phases, as specified in Algorithm 1.

Input:
Output:
(1)Create with
(2)Create with
(3)For, do
(4)  For, do
(5)   For, do
(6)    If, then
(7)     
(8)     
(9)    End if
(10)   End for
(11)  End for
(12)End for
(13)Create swarm: , set
(14)
(15)For, do
(16)For, do
(17)  
(18)  
(19)  
(20)  Constrain each element of within ,
(21)  
(22)  If, then
(23)   
(24)  End if
(25)  
(26)End for
(27)End for
(28)
(29)For, do
(30)  
(31)  
(32)  If, then
(33)  
(34)  Else
(35)   
(36)End if
(37)For, do
(38)   
(39)   
(40)   Delete the first element from
(41)   
(42)   , the number of ’-’ is the same as
(43)   
(44)   
(45)   If, then
(46)   
(47)   Else
(48)   
(49)   End if
(50)  End for
(51)End for
(52)

The first phase executes the classical Floyd’s algorithm, as in lines . Taken as input, it creates two matrixes as and . Element in is the shortest distance between node and , and element indicates the shortest path from node to node is via .

The task sequence for each robot can then be obtained with the PSO algorithm, listed in lines 13–28. With the encoding scheme introduced in Section 3, the basic part of the PSO algorithm used here is the same as the classical one. The main difference is in the calculation of the fitness function. To calculate the fitness function of a particle, we first decode it, generating the package sequence for each robot. Then, we check the sequence to see if the accumulation of weight is over the maximum payload of the robot. If so, 0 is inserted into the sequence. As 0 is the label of the warehouse, this will force the robot to go back to the warehouse at a proper time. In this way, if the total weight of packages for a robot is over its maximum payload, the robot will divide the packages into several batches. From , we can get the shortest distance for a robot to finish the task following this sequence. Then, the time spent by each robot to complete its assignment can be obtained, and finally, we get the value of the fitness function.

In order to guide robots, we then calculate the waypoint sequences for the robots and the actions taken at each waypoint, listed in lines 29–48. This is achieved by calculating the shortest sequence of waypoints that connect each pair of nodes in the task sequence according to .

In this algorithm, the function returns a random real number between and . The function decodes a particle and returns sequences, and each of them represents a task sequence for a robot. The function returns the length of vector . The function provides a sequence of waypoints, based on Floyd’s algorithm. The function offers the fitness value of the particle , achieved with Algorithm 2. In this algorithm, we first decode the particle to obtain a sequence of tasks, i.e., the destinations of packages. We then insert 0, which indicates the warehouse, into proper positions in the sequence so that the total weight a robot is carrying will not be heavier than its maximum payload. Finally, we calculate the time spent by each robot to accomplish the task and choose the solution with the maximum fitness score as the return value.

Input:
Output:
(1)
(2)For, do
(3)  , get from
(4)  For, do
(5)   
(6)   If, then
(7)    Get from
(8)    Insert 0 after the element in
(9)
(10)    
(11)   Else
(12)    
(13)   End if
(14)  End for
(15)  
(16)  Get from
(17)End for
(18).

5. Coordinate Controller for Distributed Logistic with Multiple UGVs

Based on Algorithm 2, we have developed a controller that can be distributed on a set of UGVs. These controllers can communicate with each other, achieve collective decision-making, and finally drive each robot to deliver goods to corresponding targets following an optimized path.

To further speed up the process of obtaining the work plan, distributed optimization is adopted. With this scheme, the computational load is divided into different UGVs in the system. Then, a semiphysical simulation platform is developed, with a workstation working as the working environment that executes robot models. We randomly generate various initial conditions, i.e., random maps, features of robots, and weight of packages, to compare the performance of the coordinate controller. Finally, we install these coordinate controllers to a set of UGVs and carry out experiments in a simulated environment. As the simulated environment is very similar to the actual working scenario, the experiment can demonstrate the effectiveness of the controller.

Even through each controller in the swarm has a coordinator, at any time, only one of them acts as a leader. The leader keeps sending heartbeat packages, and once it fails, the whole swarm elects a new leader. To ensure only one robot is elected as the leader, we need to map some kind of unique ID into a metric that can be sorted or compared. In our work, we use the MAC address as a unique ID for each robot, and we map a MAC address into a number, which is contained in the heartbeat package of the leader. In case that a robot receives no heartbeat package for a period, it broadcasts a heartbeat package to its peers. If a robot receives a heartbeat package with a smaller ID, it is forced into silence, acting as a follower. Finally, only the robot with the smallest ID could emit and becomes the leader.

The leader robot is in charge of controlling the optimization process of the whole swarm, so that all robots can agree to the same work plan. There are several triggers that can start the optimization process for the swarm. The trigger can be any situation that changes conditions of the last optimization, such as failure of a robot, change of destination, or arrival of new packages. Once conditions change, a new plan should be obtained as fast as possible. Thus, the coordinator sends a stop message a period after sending a start signal. This period should be short, and in our case, we set it as one minute, which can be adjusted according to application requirements.

In order to avoid contradiction, all robots in the swarm should agree to the same work plan. The leader coordinator determines the current best particle as a work plan and broadcasts it to the whole swarm.

A PSO engine that runs on each robot is shown in Figure 2. During iterations, each time a robot finds a better solution, it broadcasts the solution to the swarm. Once receiving such message, a robot will compare the solution with its current global best solution. If it is a better solution, then the current global best solution is replaced. It can be proved that ignoring package loss, each PSO engine always agrees with the best particle of the whole swarm. The problem can be abstracted into the following mathematical problem.

There are N arrays recorded as , . is a parcel of the -th robot, which is generated periodically. is the best solution of the -th robot. For each array, randomly changed, and it is not necessary for each of all robots to change synchronously. The change of obeys the following rules:(1)Initially, (2) is updated as if (3)When any in a robot is updated, , if , .

It can be ensured that the system can converge to . is the smallest number that has been generated so far for the swarm. Strict proof is given as follows.

Assume array generates a and ; according to rule (2) above, is updated only when . According to rule (3), if , is not updated, and thus we have that the current is smaller than any ever generated. Assume at , is updated. Meanwhile, will also try to update according to rule (3). There are two situations:(1)If , then so .(2)If , according to rule (1), this implies that . So, has at least been updated once. Assume that the last time is updated at . Thus, the value of is marked as , and consequently, the value of is . If , then should be updated according to rule (3), and we have . Then, we have . According to rule (2), should not be updated at .

Therefore, the second situation mentioned above is false and we can directly get the solution that is the same value. Thus, is the smallest value ever generated.

Mapping the problem mentioned above to our distributed PSO algorithm, we have that all PSO engines share the same , and thus, it is similar to the PSO algorithm running on the same computer. Besides, according to the problem mentioned above, synchronization is not necessary. Thus, the controller can run on a set of heterogeneous robots.

6. Simulation Study

6.1. Semiphysical Simulation

Based on the abovementioned ideas, we developed the controller using Raspberry Pi 3b+. This controller is called the distributed logistic controller (DLC). To help develop the controller and test its performance, we also developed a semiphysical simulation platform. Then, we randomly generated multiple maps with packages with different weights, and simulation experiments were carried out to compare the performance of the classic centralized PSO controller and the DLC.

The design of the semiphysical simulation platform is shown in Figure 3. It is developed with Qt Creator and is running on a Dell Precision 3630 workstation. It imitates the environment and dynamics models of UGVs. Multiple Raspberry Pi boards connect the simulation platform via Ethernet using a switch. Each Raspberry Pi runs a PSO optimizer engine to find optimized solutions for all robots, including task assignment and path planning. The DLC is also implemented with a Raspberry Pi to verify the system performance.

The simulation platform consists of the following parts:(1)UI interface, displaying the status of the robots and allowing people to control the simulation process(2)Robot models, equipped with the low layer of a PID controller and dynamic model of UGVs, so that the platform can simulate the performance of the UGVs(3)Simulation engine, basic structure of the simulation platform(4)Virtual environment, loaded map that can interact with the robot models(5)Virtual-physical interface, used to connect the robot controller into the simulation platform.

Besides debugging the program during the development of the controller, we used the platform to test the performance of the controller. The simulation platform is given in Figure 3. The reason why we developed a distributed controller is to deal with the situation where a high performance workstation is unavailable. The simulation process is shown in Figure 4.

Various experiments were carried out on the half-physical simulation platform to test the performance of the controller. For verifying the results, the adaptive cat swarm optimization (ACSO) in [55] and the classic centralized logistic controller are used to compare with the proposed distributed controller. As the optimization algorithm has a feature of randomness, we generate multiple maps randomly and repeat simulation of each map multiple times. The maps were generated with a different number of packages, i.e., 50 packages, 100 packages, 150 packages, and 200 packages, each with 5 maps. Meanwhile, we repeat a simulation 10 times to get a statistical conclusion. To ensure the existence of feasible solutions, all maps are connected graphs. Figure 4 shows one map structure.

For each map, three kinds of controllers are tested. The first controller is a classic centralized PSO controller, the second one is the ACSO controller in [55], and the last one is our DLC controller. For each simulation, we use a robot swarm consisting of four UGVs to deliver packages. The weight of each package is randomly generated along with each map. The weights are within the range of (0, ), where is the maximum payload capability among all the robots. These hypotheses ensure the existence of a feasible solution. The UGVs are heterogeneous, and their features are shown in Table 1.

With the conditions mentioned above, we carried out 600 experiments in total. As real controllers are connected into the emulation platform, the simulation results represent the actual performance and the working condition of the controller. For all the experiments, the DLC controllers always provided a feasible solution, with no failure occurring. This indicates that the DLC controller can work reliably and can be directly used on real robots. The simulation results are shown in Figure 5. It indicates that DLC provides better solutions than the classic PSO controller. As shown in the figures, the experimental results of the classic PSO controller are painted in blue, the experimental results of the ACSO controller are painted in green, and the performance with the DLC is painted in red. It is obvious that the mean delivery time using the DLC is the shortest among all three controllers. Furthermore, in most cases, the delivery time variation of experiments is smaller with DLCs than that with classic PSO controllers or with the ACSO controller, which improves the system performance.

The time complexity of the proposed DLC can be expressed as , with being the number of iterations, being the swarm size, being the number of robots deployed, and being the number of packages to be delivered. Besides time complexity, we also care about the convergence performance of the solution provided by the DLC. Facing a specific NP-hard problem, whose analytical optimum solution is hard to be obtained, we cannot guarantee to obtain the global optimum solution. However, we can verify the extent to which it performs better than available methods with a large number of experiments. Furthermore, statistical conclusions can be drawn based on experiment results. We define a metric as . For the same set of experimental conditions, we can obtain a set of solutions. is the mean value of solutions obtained with either a PSO controller or the ACSO controller, and is the mean value of the solution obtained with the DLC. It represents a normalization of the extra mean task makespan taking the mean value of the DLC as a benchmark. It turns out that compared with a PSO controller, the values of for maps with 50, 75, 100, and 150 packages are 4.87 %, 12.51 %, 10.28 %, and 8.95 %, respectively. Compared with the ACSO controller, the values of for maps with 50, 75, 100, and 150 packages are 11.80 %, 16.50 %, 11.63 %, and 10.79 %, respectively. As time spent on every experiment is the same, the results indicated that the DLC has a higher chance to converge faster than the other two methods in our experiments. We further define the metric , in which is the standard deviation of a set of solutions, and is the mean value of this set of solutions. This metric is a normalization of the standard deviation of a set of solutions. For the DLC, the values of for maps with 50, 75, 100, and 150 packages are , , , and , respectively. The values indicate that the DLC can provide stable solutions in spite of changes in maps.

6.2. Experiment with UGVs

Finally, we installed these controllers into UGVs and experiments were carried out. In these experiments, goods are delivered by three different robots. The features of the robots are shown in Table 2. In the table, we report the maximum payload and the speed of each robot. We assume that each robot spends constant time to pickup or drop a package. The time is also shown in the table. The arena consists of 15 houses and 1 warehouse, with the map shown in Figure 6.

We assume a truck will carry all the packages and the robots to the warehouse, and then robots are sent out to deliver each package to its destination. We assign label 0 to the warehouse, and the labels of the houses increase from 1 to 15. The weight of the packages for each house is shown in Table 3.

The algorithm proposed in this paper was used to solve the task assignment and path planning problems for a multirobot system. We assign the tasks for each robot and plan the paths for them to follow, so that the task can be accomplished in the shortest time. In order to achieve this purpose, we first build a graph model according to the map shown in Figure 6. Both the warehouse and the houses are treated as nodes. If a robot can move from one node directly to another node, without passing by a third node, we connect the two nodes with a edge, with the distance between the two nodes assigned as the weight of the edge. The graph model of the environment is shown in Figure 7. The adjacency matrix can then be built based on the graph model.

For the PSO algorithm, we set the swarm size as 50 and the total iteration time as 200. The minimum time to complete all tasks is about 727 s, and this solution can be obtained within 30 seconds. This makespan value does not vary much, even over a large number of iterations. We also obtain the tasks and paths for each robot, as shown in Tables 4, 5, and 6. In these tables, the waypoint sequences are given for a robot to follow, and its corresponding behavior is also listed, such as pickup packages (pp), drop packages (dp), or just pass by the waypoint (—). The physical simulation scene is shown in Figure 8.

7. Conclusion

In this paper, we consider the last-mile delivery problem with multiple heterogeneous unmanned ground vehicles (UGVs), which is formulated as an optimization problem aimed at minimizing the maximum makespan. The proposed algorithm combines the Floyd’s algorithm and the particle swarm optimization (PSO) algorithm for task assignment and path planning. A proposed distributed logistic controller enables UGVs to achieve excellent solutions in dynamic environments. A distributed logistic controller (DLC) is developed to control multiple UGVs. Semiphysical simulation results with the DLC have the shortest mean delivery time compared with using a classic PSO controller and the ACSO controller. In most cases, the observed experimental delivery time variations are smaller with DLCs than that with classic PSO controllers or with the ACSO controller. Experiments with UGV prototypes were carried out. The results confirm the validity and applicability of the developed approach. Goods are delivered with optimized makespan obtained with the proposed algorithm.

Abbreviations

UGV:Unmanned ground vehicle.
PSO:Particle swarm optimization.
VRP:Vehicle routing problem.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Yuzhan Wu conceptualized the study and developed methodology. Yuzhan Wu, Yuanhao Ding, Susheng Ding, and Meng Li validated the study. Yuzhan Wu wrote the original draft. Yvon Savaria and Meng Li reviewed and edited the article. All authors have read and agreed to the published version of the manuscript.