| Initialization: |
| Set the minimum capacity values , , Exploration steps , , and where : . |
| Select , , update , , and accumulated hypothesis/reward based on |
| if , then |
| Exploration: |
| for do |
| Select , , and update , (8) |
| Execute , observe and update |
| if then |
| Reward, |
| Update and , (11) |
| Update |
| else |
| Reward, |
| Update and , (11) |
| Update |
| end if |
| if , , , then |
| Select , , |
| else |
| Select , (12) |
| end if |
| Exploitation: |
| for do |
| Execute the action |
| end for |
| end for |
| end if |
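
The control flow of the listing above (explore, reward or penalize, reselect, then exploit inside each round) can be sketched roughly as follows. This is only an illustrative sketch and not the authors' method: the symbols in the listing did not survive, so the function name `explore_exploit`, the `observe` callback, the ±1 reward values, the capacity-threshold test, and the cyclic reselection rule are all assumptions. They stand in for the update rules referenced as (8), (11), and (12).

```python
def explore_exploit(actions, observe, min_capacity, rounds, exploit_steps):
    """Hypothetical explore/exploit loop mirroring the listing's structure."""
    # Initialization: accumulated reward per action, first action selected.
    reward = {a: 0.0 for a in actions}
    idx = 0
    best = actions[0]
    executed = []

    for _ in range(rounds):
        # Exploration: execute the selected action and observe the outcome.
        action = actions[idx]
        capacity = observe(action)

        if capacity >= min_capacity:
            # Assumed positive reward when the capacity target is met.
            reward[action] += 1.0
        else:
            # Assumed negative reward; move on to the next candidate (cyclic).
            reward[action] -= 1.0
            idx = (idx + 1) % len(actions)

        # Selection: assumed greedy choice on accumulated reward.
        best = max(actions, key=lambda a: reward[a])

        # Exploitation: execute the chosen action for a fixed number of steps.
        for _ in range(exploit_steps):
            observe(best)
            executed.append(best)

    return best, reward, executed
```

For example, with two candidate actions whose observed capacities are fixed at 1.0 and 5.0 and a minimum capacity of 3.0, the loop penalizes the first action once and then settles on the second. The greedy selection and the fixed reward magnitudes are placeholders; the paper's own update equations should replace them.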