Research Article

Multirobot Coverage Path Planning Based on Deep Q-Network in Unknown Environment

Table 1

The average results of the last ten episodes when reward parameters change.

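R1 = 1. Coverage rate and repetition rate are average results from the 191st episode to the 200th episode.

| R2 | R3 | R4 | Coverage rate | Repetition rate |
| --- | --- | --- | --- | --- |
| 10 | 0 | −10 | 96.8% | 378% |
| 10 | −0.01 | −1 | 100% | 163.4% |
| 10 | −0.01 | −10 | 98.8% | 306.8% |
| 10 | −0.1 | −1 | 100% | 170.3% |
| 10 | −0.1 | −10 | 100% | 197.7% |
| 100 | −0.01 | −1 | 97.4% | 382% |
| 100 | −0.01 | −10 | 95.4% | 356.3% |
| 100 | −0.1 | −1 | 92% | 422.3% |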