Research Article

Supervised Reinforcement Learning for ULV Path Planning in Complex Warehouse Environment

Table 1

External rewards configuration.

Reward item                                    Reward value

The ULV reaches the goal                       +30
The ULV completes all tasks                    +30
The ULV collides with an obstacle              -15
The ULV collides with a wall                   -15
The ULV collides with another agent            -30
The ULV moves a step                           -0.1
The ULV moves (a step) closer to the target    +0.6
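
As an illustration of how the values in Table 1 might be applied, the following minimal Python sketch encodes them as a lookup table and sums the rewards for the events observed in one step. The names `REWARDS`, `StepOutcome`, and `external_reward` are hypothetical and do not come from the article. The sketch also assumes that the rewards are additive and that the step cost is charged on every move; the article's actual combination rule may differ.

```python
from dataclasses import dataclass

# Reward values copied from Table 1. The dictionary keys are illustrative names.
REWARDS = {
    "reach_goal": 30.0,
    "complete_all_tasks": 30.0,
    "hit_obstacle": -15.0,
    "hit_wall": -15.0,
    "hit_agent": -30.0,
    "step": -0.1,
    "closer_to_target": 0.6,
}


@dataclass
class StepOutcome:
    """Hypothetical event flags describing what happened in one ULV step."""
    reached_goal: bool = False
    completed_all_tasks: bool = False
    hit_obstacle: bool = False
    hit_wall: bool = False
    hit_agent: bool = False
    moved_closer: bool = False


def external_reward(outcome: StepOutcome) -> float:
    """Sum the Table 1 rewards for every event flagged in this step.

    Assumption: rewards are additive and the step cost applies to every move.
    """
    total = REWARDS["step"]
    if outcome.moved_closer:
        total += REWARDS["closer_to_target"]
    if outcome.reached_goal:
        total += REWARDS["reach_goal"]
    if outcome.completed_all_tasks:
        total += REWARDS["complete_all_tasks"]
    if outcome.hit_obstacle:
        total += REWARDS["hit_obstacle"]
    if outcome.hit_wall:
        total += REWARDS["hit_wall"]
    if outcome.hit_agent:
        total += REWARDS["hit_agent"]
    return total
```

Under these assumptions, a step that brings the ULV closer to the target and onto the goal would score -0.1 + 0.6 + 30 = 30.5, whereas a step that ends in a collision with another agent would score -0.1 - 30 = -30.1.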