Journal of Robotics

Research Article

Multirobot Coverage Path Planning Based on Deep Q-Network in Unknown Environment

Deduction method in MCPP.

	Initialization: maximum number of simulation steps, discount factor, temporary environment information, action of each robot, total reward of each action group, current reward from the selected action
	Choose action groups for robots
	for each of the k action groups do
	while the number of the current step is less than the maximum number of simulation steps do
	for each robot do
	if the action of the robot is feasible then
	Choose a feasible next action according to the value network

	Update the temporary environment information according to the selected action
	else

	end if
	end for

	end while

	end for
	Choose an action group by
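
The listing above can be sketched in Python as a minimal, illustrative example. The formulas of the original listing are not legible in this copy, so several parts are assumptions rather than the authors' definitions: the grid world, the reward values `R_NEW`, `R_REPEAT`, and `R_INFEASIBLE`, the discounted accumulation of rewards, the final selection of the action group with the largest total reward, and the function `toy_value`, which is a hypothetical stand-in for the trained value network. Collisions between robots are ignored for brevity.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Cell = Tuple[int, int]
MOVES: Dict[str, Cell] = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Assumed reward values for illustration; not taken from the paper.
R_NEW, R_REPEAT, R_INFEASIBLE = 1.0, -0.1, -1.0


def target(pos: Cell, action: str) -> Cell:
    """Cell reached by taking `action` from `pos`."""
    dr, dc = MOVES[action]
    return (pos[0] + dr, pos[1] + dc)


def feasible(pos: Cell, action: str, size: int, obstacles: Set[Cell]) -> bool:
    """An action is feasible if it stays on the grid and avoids known obstacles."""
    r, c = target(pos, action)
    return 0 <= r < size and 0 <= c < size and (r, c) not in obstacles


def toy_value(pos: Cell, action: str, covered: Set[Cell]) -> float:
    """Hypothetical stand-in for the value network: prefer uncovered cells."""
    return 0.0 if target(pos, action) in covered else 1.0


def deduce(
    positions: Sequence[Cell],
    covered: Set[Cell],
    obstacles: Set[Cell],
    size: int,
    action_groups: Sequence[Sequence[str]],
    max_steps: int = 5,
    gamma: float = 0.9,
    value_fn: Callable[[Cell, str, Set[Cell]], float] = toy_value,
) -> Tuple[Tuple[str, ...], float]:
    """Simulate each candidate action group and return the best one (assumed rule)."""
    best_group: Tuple[str, ...] = ()
    best_total = float("-inf")
    for group in action_groups:
        # Temporary environment information: copies, so the real state is untouched.
        temp_covered = set(covered)
        temp_pos: List[Cell] = list(positions)
        actions: List[str] = list(group)
        total = 0.0
        step = 0
        while step < max_steps:
            for i in range(len(temp_pos)):
                if feasible(temp_pos[i], actions[i], size, obstacles):
                    nxt = target(temp_pos[i], actions[i])
                    reward = R_REPEAT if nxt in temp_covered else R_NEW
                    # Update the temporary environment with the selected action.
                    temp_pos[i] = nxt
                    temp_covered.add(nxt)
                    # Choose a feasible next action according to the value function.
                    options = [a for a in MOVES if feasible(nxt, a, size, obstacles)]
                    if options:
                        actions[i] = max(options, key=lambda a: value_fn(nxt, a, temp_covered))
                else:
                    reward = R_INFEASIBLE
                total += (gamma ** step) * reward  # assumed discounted sum
            step += 1
        if total > best_total:
            best_group, best_total = tuple(group), total
    return best_group, best_total


# Usage: one robot in the corner of a 3x3 grid, two candidate action groups.
print(deduce([(0, 0)], {(0, 0)}, set(), 3, [("up",), ("right",)]))
```

Simulating on copies of the covered set and robot positions keeps the deduction side-effect free, so the same real state can be reused for every candidate action group.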