Research Article
Experience Weighted Learning in Multiagent Systems
| Variable | Explanation |
| --- | --- |
| N(t) | The observed count of interactions at time t |
|  | The action of an agent at time t |
|  | The policy mapping action a to a probability |
|  | The discount rate for experience |
|  | The decay of utility with respect to time |
|  | The reward of action a |
|  | The utility of taking action a |