Research Article

Coordinated Learning by Model Difference Identification in Multiagent Systems with Sparse Interactions

Algorithm 1

A model-based method for identifying coordinated states of agent i.
Input: Individual original MDP M_i, individual optimal Q-values Q_i of agent i, threshold value
proportion ξ, integer T for Monte Carlo sampling times, exploration factor ε
Output: The set of coordinated states C_i for agent i
      // Performing Monte Carlo sampling to get empirical local MDPs.
(1)    for t = 1 to T do
(2)     ε decreases down to a small value by multiplying with factor λ;
(3)     agent i selects action a_i according to Q_i using local state s_i with ε-random policy;
(4)     agent i observes local reward r_i, next state s_i' according to the global environmental state and joint action;
(5)     agent i records (s_i, a_i, r_i, s_i') and the state information of the other agents;
(6)     agent i updates its individual Q-values Q_i according to the received experience;
(7)    end for
(8)    for each agent j ≠ i do
(9)     for each state s_j ∈ S_j do
(10)      agent i gets the empirical local MDP M̂_i(s_j) of agent i when agent j is in state s_j;
(11)  end for
(12) end for
      // Determining the coordinated states of agent i according to the two MDPs.
(13) for each state s_i ∈ S_i, agent j ≠ i, and state s_j ∈ S_j do
(14)  compute the model difference degree d_i(s_i, s_j) between M_i and M̂_i(s_j) according to (4);
(15)  if d_i(s_i, s_j) is bigger than the threshold then
(16)    agent i augments the coordinated states of agent i in state s_i to include state s_j of agent j;
(17)  end if
(18) end for
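The second phase of Algorithm 1 (comparing each empirical local MDP against agent i's original MDP and flagging states whose models diverge) can be sketched as follows. Since Eq. (4) is not reproduced in this excerpt, the model difference degree below is a hypothetical L1-style stand-in over transition probabilities and rewards, and all function and variable names are illustrative, not the paper's.

```python
import numpy as np

def model_difference_degree(P_orig, R_orig, P_emp, R_emp):
    """Hypothetical stand-in for the paper's Eq. (4): per-state L1 gap
    between transition models plus the reward gap, maximized over actions.

    P_*: transition tensors of shape (n_states, n_actions, n_states)
    R_*: reward matrices of shape (n_states, n_actions)
    Returns a vector of shape (n_states,), one difference degree per state.
    """
    trans_gap = np.abs(P_orig - P_emp).sum(axis=2).max(axis=1)
    reward_gap = np.abs(R_orig - R_emp).max(axis=1)
    return trans_gap + reward_gap

def identify_coordinated_states(P_i, R_i, empirical_models, threshold):
    """For each state s_j of another agent, flag the states s_i of agent i
    whose empirical local model deviates from the original MDP by more
    than the threshold, mirroring lines (13)-(18) of Algorithm 1.

    empirical_models: dict mapping s_j -> (P_emp, R_emp), the empirical
    local MDP of agent i observed while the other agent was in s_j.
    Returns a dict mapping s_i -> set of coordinated states s_j.
    """
    coordinated = {}
    for s_j, (P_emp, R_emp) in empirical_models.items():
        d = model_difference_degree(P_i, R_i, P_emp, R_emp)
        for s_i in np.flatnonzero(d > threshold):
            coordinated.setdefault(int(s_i), set()).add(s_j)
    return coordinated
```

On a toy two-state, one-action MDP where the other agent's state 1 flips agent i's transition and removes its reward in state 0, only state 0 is flagged as coordinated, which matches the intent of the thresholding step.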