Research Article
Supervised Reinforcement Learning for ULV Path Planning in Complex Warehouse Environment
Table 1
External rewards configuration.
| Reward item | Reward value |
| --- | --- |
| The ULV reaches the goal | +30 |
| The ULV completes all tasks | +30 |
| The ULV collides with an obstacle | -15 |
| The ULV collides with a wall | -15 |
| The ULV collides with another agent | -30 |
| The ULV moves a step | -0.1 |
| The ULV moves (a step) closer to the target | +0.6 |
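As a rough illustration, the values in Table 1 could be encoded as a lookup table in code. The sketch below is not taken from the article: the `Event` names and the `external_reward` helper are hypothetical, and it assumes that the rewards for all events occurring in a single step are simply summed, which the table itself does not state.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical event labels, one per row of Table 1."""
    REACHED_GOAL = "reached_goal"
    COMPLETED_ALL_TASKS = "completed_all_tasks"
    HIT_OBSTACLE = "hit_obstacle"
    HIT_WALL = "hit_wall"
    HIT_AGENT = "hit_agent"
    MOVED_STEP = "moved_step"
    MOVED_CLOSER = "moved_closer"


# Reward values copied from Table 1.
EXTERNAL_REWARDS = {
    Event.REACHED_GOAL: 30.0,
    Event.COMPLETED_ALL_TASKS: 30.0,
    Event.HIT_OBSTACLE: -15.0,
    Event.HIT_WALL: -15.0,
    Event.HIT_AGENT: -30.0,
    Event.MOVED_STEP: -0.1,
    Event.MOVED_CLOSER: 0.6,
}


def external_reward(events):
    """Return the total external reward for the events seen in one step.

    Assumption (not from the source): rewards of co-occurring events add up.
    """
    return sum(EXTERNAL_REWARDS[event] for event in events)


if __name__ == "__main__":
    # A step that also brings the ULV closer to the target.
    print(external_reward([Event.MOVED_STEP, Event.MOVED_CLOSER]))
    # A step that ends in a collision with another agent.
    print(external_reward([Event.MOVED_STEP, Event.HIT_AGENT]))
```

Under this additive assumption, an ordinary step toward the target nets a small positive reward (the step penalty of -0.1 plus the +0.6 bonus), while any other step is mildly penalized, which discourages wandering.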