Research Article

Service Migration Policy Optimization considering User Mobility for E-Healthcare Applications

Algorithm 2

Value_evaluate(policy, env, max_step = 100, tol = 1e-6):
    initialize V  # value table, one entry per state
    for i in range(max_step):
        new_V = V.copy()
        for all s in S:  # for every state, update the value function
            qs = np.zeros(N_ACTIONS, dtype = np.float32)  # Q values of state s
            for a in range(N_ACTIONS):  # store the Q value for each action
                n_s = env.P[s, a]  # next state reached from s under action a
                r = env.R[s, a]  # immediate reward
                n_V = V[n_s[0], n_s[1]]  # current value estimate of the next state
                qs[a] = r + gamma * n_V
            end for
            new_V[s] = np.sum(qs * policy[s])  # expectation of Q over the policy's action probabilities
        end for
        if np.sum(np.abs(V - new_V)) < tol:  # stop once the value table has converged
            break
        V = new_V
    end for
    return V
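The listing above can be sketched as runnable NumPy code. This is a minimal illustration under stated assumptions, not the authors' implementation: states are flattened to integer indices instead of (row, column) pairs, transitions are deterministic, the environment is passed as two plain arrays `P` (next-state indices) and `R` (rewards) in place of `env`, and `gamma` is made an explicit parameter. The two-state toy problem is hypothetical and serves only to check the update V(s) = sum over a of policy(s, a) * (R(s, a) + gamma * V(next state)).

```python
import numpy as np


def value_evaluate(policy, P, R, gamma=0.9, max_step=100, tol=1e-6):
    """Iterative policy evaluation for a deterministic, tabular MDP.

    policy: (n_states, n_actions) array of action probabilities per state.
    P:      (n_states, n_actions) integer array of next-state indices (assumed layout).
    R:      (n_states, n_actions) array of immediate rewards (assumed layout).
    """
    n_states, _ = R.shape
    V = np.zeros(n_states)
    for _ in range(max_step):
        # Q[s, a] = R[s, a] + gamma * V[next state]; V[P] looks up every next-state value.
        Q = R + gamma * V[P]
        # Expected Q value under the policy gives the new state value.
        new_V = np.sum(policy * Q, axis=1)
        converged = np.sum(np.abs(V - new_V)) < tol
        V = new_V
        if converged:
            break
    return V


# Hypothetical two-state example: state 1 is absorbing with zero reward;
# in state 0, action 0 stays (reward 0) and action 1 moves to state 1 (reward 1).
P = np.array([[0, 1], [1, 1]])
R = np.array([[0.0, 1.0], [0.0, 0.0]])
policy = np.full((2, 2), 0.5)  # uniform random policy

V = value_evaluate(policy, P, R, gamma=0.9)
# Analytically, V[0] = 0.5 / (1 - 0.5 * gamma) and V[1] = 0 for this toy problem.
print(V)
```

The vectorized update replaces the two inner loops of the listing with array operations; the stopping rule is the same summed absolute difference compared against `tol`.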