Initialization:
  Set the minimum capacity values , , Exploration steps , , and where : .
  Select , , update , , and accumulated hypothesis/reward based on
if , then
  Exploration:
  for do
    Select , , and update , (8)
    Execute , observe and update
    if then
      Reward,
      Update and , (11)
      Update
    else
      Reward,
      Update and , (11)
      Update
    end if
    if , , , | then
      Select , ,
    else
      Select , (12)
    end if
    Exploitation:
    for do
      Execute the action
    end for
  end for
end if
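The control flow of the listing is an exploration loop that executes an action, observes the outcome, and assigns one of two rewards depending on a condition, followed by an exploitation loop that repeats the selected action. It can be illustrated with a minimal explore-then-exploit sketch. This is not the authors' procedure: the selection rules (8) and (12) and the update (11) are not recoverable from the listing, so the sketch substitutes round-robin selection, a binary reward from a capacity threshold, and a running-mean update. It also runs the two phases one after the other rather than nesting them. Every name (`explore_then_exploit`, `observe`, `min_capacity`, `explore_steps`, `exploit_steps`) is hypothetical.

```python
def explore_then_exploit(observe, n_actions, min_capacity,
                         explore_steps, exploit_steps):
    """Hypothetical explore-then-exploit loop with a thresholded reward.

    observe(a) returns the capacity measured after executing action a.
    """
    counts = [0] * n_actions          # times each action was tried
    mean_reward = [0.0] * n_actions   # running mean reward per action

    # Exploration: try actions in round-robin order (stand-in for the
    # paper's selection rule, which is not recoverable here).
    for t in range(explore_steps):
        a = t % n_actions
        capacity = observe(a)
        # Binary reward: 1 if the minimum capacity is met, else 0 (assumed).
        reward = 1.0 if capacity >= min_capacity else 0.0
        counts[a] += 1
        # Incremental running-mean update (stand-in for the paper's update).
        mean_reward[a] += (reward - mean_reward[a]) / counts[a]

    # Pick the action with the highest estimated reward.
    best = max(range(n_actions), key=lambda i: mean_reward[i])

    # Exploitation: repeat the selected action.
    history = [observe(best) for _ in range(exploit_steps)]
    return best, mean_reward, history


# Deterministic toy environment: a fixed capacity per action.
capacities = [0.2, 0.9, 0.5]
best, mean_reward, history = explore_then_exploit(
    observe=lambda a: capacities[a],
    n_actions=3,
    min_capacity=0.6,
    explore_steps=6,
    exploit_steps=3,
)
```

In this toy run only the second action meets the threshold, so it is the one repeated during exploitation. A noisy `observe` would need more exploration steps for the estimates to separate.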