Research Article
Experience Weighted Learning in Multiagent Systems
| (1) | repeat |
| (2) | i = 0 |
| (3) | Initialize Q (s, a) |
| (4) | repeat |
| (5) | Choose an action A using policy derived from Q (e.g., ε-greedy) |
| (6) | Choose an opponent randomly |
| (7) | Take action A and observe R, |
| (8) | |
| (9) | |
| (10) | until S is terminal |
| (11) | i = i + 1 |
| (12) | until i = the total number of all the agents |
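The loop structure of the listing can be sketched in Python as follows. This is only an illustration under several assumptions, not the article's method: the contents of steps (8) and (9) are not recoverable from the listing, so the sketch substitutes the standard tabular Q-learning update and a plain state transition. The coordination-game `payoff`, the single-state environment, the fixed episode length used in place of a terminal state, and all names (`make_q_tables`, `epsilon_greedy`, `run_sweep`) are hypothetical.

```python
import random


def payoff(own_action, opponent_action):
    """Hypothetical reward: 1 if both agents pick the same action, else 0."""
    return 1.0 if own_action == opponent_action else 0.0


def make_q_tables(n_agents, n_states=1, n_actions=2):
    """Step (3): one zero-initialized Q(s, a) table per agent."""
    return [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_agents)]


def epsilon_greedy(q_row, epsilon, rng):
    """Step (5): explore with probability epsilon, else act greedily (random tie-break)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def run_sweep(q_tables, steps_per_episode, alpha, gamma, epsilon, rng):
    """One pass over all agents, mirroring steps (2) to (12) of the listing."""
    n_agents = len(q_tables)
    for i in range(n_agents):  # steps (2), (11), (12): iterate over the agents
        state = 0  # assumption: single-state environment
        for _ in range(steps_per_episode):  # assumption: fixed length stands in for "S is terminal"
            action = epsilon_greedy(q_tables[i][state], epsilon, rng)  # step (5)
            j = rng.choice([k for k in range(n_agents) if k != i])  # step (6)
            opponent_action = epsilon_greedy(q_tables[j][state], epsilon, rng)
            reward = payoff(action, opponent_action)  # step (7)
            next_state = 0
            # Assumed stand-in for step (8): standard Q-learning update.
            target = reward + gamma * max(q_tables[i][next_state])
            q_tables[i][state][action] += alpha * (target - q_tables[i][state][action])
            # Assumed stand-in for step (9): move to the next state.
            state = next_state
    return q_tables
```

For example, `run_sweep(make_q_tables(4), 20, 0.1, 0.9, 0.1, random.Random(0))` runs one sweep over four agents. The experience-weighted update that the title refers to would replace the assumed step (8) line.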