Research Article
Multirobot Coverage Path Planning Based on Deep Q-Network in Unknown Environment
Table 1
The average results of the last ten episodes when reward parameters change.
R1 = 1; average results from the 191st episode to the 200th episode.

| R2 | R3 | R4 | Coverage rate | Repetition rate |
| --- | --- | --- | --- | --- |
| 10 | 0 | −10 | 96.8% | 378% |
| 10 | −0.01 | −1 | 100% | 163.4% |
| 10 | −0.01 | −10 | 98.8% | 306.8% |
| 10 | −0.1 | −1 | 100% | 170.3% |
| 10 | −0.1 | −10 | 100% | 197.7% |
| 100 | −0.01 | −1 | 97.4% | 382% |
| 100 | −0.01 | −10 | 95.4% | 356.3% |
| 100 | −0.1 | −1 | 92% | 422.3% |
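
One way to read Table 1 is to rank the reward settings by the two reported metrics. The sketch below is only an illustration: it assumes that a higher coverage rate is preferred first and that a lower repetition rate breaks ties, a criterion the table itself does not state. The names `TABLE_1` and `rank_settings` are hypothetical and exist only for this example; the numbers are copied from the table.

```python
# Rows of Table 1 (R1 = 1): (R2, R3, R4, coverage rate in %, repetition rate in %).
TABLE_1 = [
    (10, 0.0, -10, 96.8, 378.0),
    (10, -0.01, -1, 100.0, 163.4),
    (10, -0.01, -10, 98.8, 306.8),
    (10, -0.1, -1, 100.0, 170.3),
    (10, -0.1, -10, 100.0, 197.7),
    (100, -0.01, -1, 97.4, 382.0),
    (100, -0.01, -10, 95.4, 356.3),
    (100, -0.1, -1, 92.0, 422.3),
]


def rank_settings(rows):
    """Sort by coverage rate (descending), then repetition rate (ascending).

    Assumption: this ordering is an illustrative criterion, not one
    given in the source.
    """
    return sorted(rows, key=lambda row: (-row[3], row[4]))


ranked = rank_settings(TABLE_1)
best = ranked[0]
print(f"R2={best[0]}, R3={best[1]}, R4={best[2]}: "
      f"coverage {best[3]}%, repetition {best[4]}%")
```

Under this assumed criterion, the setting R2 = 10, R3 = −0.01, R4 = −1 ranks first, with 100% coverage and the lowest repetition rate in the table (163.4%).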