Research Article
An Improved Quantum-Behaved Particle Swarm Optimization Algorithm Combined with Reinforcement Learning for AUV Path Planning
| Initialize the particles’ positions, global best, and personal best with their fitness values |
| Initialize the weight vector of the deep Q-network |
| Compute the mean best and diversity of the particles using equations (5) and (18) |
| While i = 1 to Maxiter |
| For each particle do |
| Choose the best action |
| Switch action |
| Case normal |
| Update the particles using equations (3), (4), and (19) |
| Case exploration |
| Update the particles using equations (3) and (4) |
| Case particle explode |
| Initialize the mbest |
| Case random mutation |
| Update the particles using equations (3), (4), and (20) |
| Case fine-tuning operation |
| While j = 1 to 3 |
| While k = 1 to K |
| Update the particles using equations (21), (22), and (23) |
| Compute the fitness value of personal best |
| End |
| End |
| i = i + K − 1 |
| Set an immediate reward using equation (17) |
| End |
| End |
| Update the global best and personal best with their fitness values |
| Compute the mean best and diversity of the particles using equations (5) and (18) |
| Store transition in |
| Sample random mini-batch of transitions from |
| Calculate target value function |
| Perform a gradient descent step on |
| i = i + 1 |
| End |
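
The control flow of the pseudocode above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The position update is the textbook QPSO rule, standing in for equations (3) to (5), which are not reproduced in this section. A running-average value per action replaces the deep Q-network, replay memory, and gradient step. The reward is simply the improvement in global best fitness rather than equation (17). One action is chosen per iteration rather than per particle. The names `run`, `qpso_move`, `sphere`, and `ALPHA`, and all numeric values, are hypothetical.

```python
import math
import random

ACTIONS = ("normal", "exploration", "explode", "mutation", "fine_tune")
# Illustrative contraction-expansion coefficients per action (assumed values).
ALPHA = {"normal": 0.75, "exploration": 1.0, "explode": 0.75,
         "mutation": 0.75, "fine_tune": 0.3}

def sphere(x):
    """Toy objective (assumption); the paper minimizes an AUV path cost."""
    return sum(v * v for v in x)

def qpso_move(x, pb, gb, mbest, alpha, rng):
    """Textbook QPSO position update for a single particle."""
    out = []
    for d in range(len(x)):
        phi = rng.random()
        u = 1.0 - rng.random()  # u in (0, 1], so log(1/u) stays finite
        attractor = phi * pb[d] + (1.0 - phi) * gb[d]
        step = alpha * abs(mbest[d] - x[d]) * math.log(1.0 / u)
        out.append(attractor + step if rng.random() < 0.5 else attractor - step)
    return out

def run(fitness=sphere, n=15, dim=2, max_iter=60, K=3, eps=0.2,
        lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    pbest = [x[:] for x in xs]
    pfit = [fitness(x) for x in xs]
    g = min(range(n), key=pfit.__getitem__)
    gbest, gfit = pbest[g][:], pfit[g]
    q = {a: 0.0 for a in ACTIONS}    # stand-in for the deep Q-network
    seen = {a: 0 for a in ACTIONS}
    history = [gfit]
    i = 0
    while i < max_iter:
        # Epsilon-greedy choice over the five actions of the pseudocode.
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=q.get)
        mbest = [sum(p[d] for p in pbest) / n for d in range(dim)]
        if action == "explode":
            # "Initialize the mbest": re-draw it at random (assumed meaning).
            mbest = [rng.uniform(lo, hi) for _ in range(dim)]
        # Fine-tuning spends K sweeps, so the iteration counter advances by K.
        sweeps = K if action == "fine_tune" else 1
        before = gfit
        for _ in range(sweeps):
            for j in range(n):
                xs[j] = qpso_move(xs[j], pbest[j], gbest, mbest, ALPHA[action], rng)
                if action == "mutation":
                    # Placeholder mutation: Gaussian kick on one coordinate.
                    xs[j][rng.randrange(dim)] += rng.gauss(0.0, 0.1 * (hi - lo))
                f = fitness(xs[j])
                if f < pfit[j]:
                    pbest[j], pfit[j] = xs[j][:], f
                    if f < gfit:
                        gbest, gfit = xs[j][:], f
        reward = before - gfit  # placeholder reward, not equation (17)
        seen[action] += 1
        q[action] += (reward - q[action]) / seen[action]  # running average
        history.append(gfit)
        i += sweeps
    return gbest, gfit, history
```

Calling `run(seed=1)` returns the best position found, its fitness, and the global best fitness recorded after each outer iteration. Advancing `i` by `sweeps` mirrors the pseudocode's `i = i + K − 1` followed by `i = i + 1`, so a fine-tuning step consumes K iterations of the budget.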