Research Article

Decentralized Reinforcement Learning Approach for Microgrid Energy Management in Stochastic Environment

Algorithm 1

//initialization
(1)  Initialize the learning parameters and ,
(2) Set K1, Tk, , μ,
(3) Initialize the Q-function for all states and actions.
//learning
(4) For k = 1: K1
  //exploration period
(5)   For n = 1: Tk
(6)    For t = 1: 24
(7)    Each agent senses the states of the environment
(8)    The demand is drawn from an exponential random distribution
(9)    The outputs of wind and PV are determined.
(10)    Each agent takes random actions with the probability 1- and selects the best action with the probability based on
(11)    MO clears the market
(12)    Each agent observes its immediate reward
(13)    The Q-function for each agent is updated according to equation (6)
(14)    End
(15)   End
  //end exploration period
(16)   is updated with for each agent with the probability 1 − μ
(17) End
//end learning
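
The loop structure of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only so the example runs: the names alpha, gamma, and p_greedy stand in for the learning parameters and the action-selection probability, whose symbols are not legible in the listing; the standard tabular Q-learning update stands in for equation (6), which is not reproduced here; toy_market is a hypothetical placeholder for market clearing by the MO; states and renewable outputs are drawn at random instead of being sensed or modeled. Step (16) is left out because its update rule cannot be recovered from the listing.

```python
import numpy as np


def toy_market(actions, demand, renewable):
    """Hypothetical stand-in for market clearing (steps 11-12).

    Every agent receives the same reward: the negative supply-demand imbalance.
    """
    supply = actions.sum() + renewable
    imbalance = abs(supply - demand)
    return np.full(actions.shape, -imbalance, dtype=float)


def run_learning(n_agents=2, n_states=4, n_actions=3,
                 K1=3, Tk=5, alpha=0.1, gamma=0.9, p_greedy=0.8, seed=0):
    rng = np.random.default_rng(seed)
    idx = np.arange(n_agents)
    # Step 3: one Q-table per agent; zero initialization is an assumption.
    Q = np.zeros((n_agents, n_states, n_actions))

    for k in range(K1):                      # step 4
        for n in range(Tk):                  # step 5: exploration period
            # Step 7: states are drawn at random here (assumption).
            states = rng.integers(n_states, size=n_agents)
            for t in range(24):              # step 6: hours of one day
                # Step 8: exponential demand; the scale is an assumed value.
                demand = rng.exponential(scale=1.0)
                # Step 9: placeholder for wind and PV output.
                renewable = rng.uniform(0.0, 1.0)
                # Step 10: best action with probability p_greedy, else random.
                greedy = rng.random(n_agents) < p_greedy
                best = Q[idx, states].argmax(axis=1)
                random_actions = rng.integers(n_actions, size=n_agents)
                actions = np.where(greedy, best, random_actions)
                # Steps 11-12: market clearing and immediate rewards.
                rewards = toy_market(actions, demand, renewable)
                next_states = rng.integers(n_states, size=n_agents)
                # Step 13: standard Q-learning update, assumed in place of eq. (6).
                target = rewards + gamma * Q[idx, next_states].max(axis=1)
                Q[idx, states, actions] += alpha * (target - Q[idx, states, actions])
                states = next_states
        # Step 16 is intentionally omitted (see the note above).
    return Q


Q = run_learning()
```

Each agent keeps its own table and updates it only from its own state, action, and reward, which is the decentralized aspect the algorithm relies on; the only coupling between agents in this sketch is through the shared market outcome.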