Research Article

HVAC Optimal Control with the Multistep-Actor Critic Algorithm in Large Action Spaces

Algorithm 2: Multistep-Actor Critic algorithm.
(1) Build the DQN over the set of basic actions
(2) Train the DQN, the value network, and the transition model
(3) For every 30 minutes do
(4)   Construct the search tree from the transition model and the basic actions
(5)   While the chosen action a ≠ [0, 0, 0, 0] do
(6)     Choose the best action with the DQN and add this choice to the search tree
(7)     Expand the search tree based on this choice and the transition model
(8)   Select the k states closest to the original state from the state set using KNN
(9)   Using the value network, find the best state, i.e., the one that yields the lowest energy consumption
(10)  Change the set point accordingly
(11) End for
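
The control loop in steps (4) to (10) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the trained DQN, transition model, and value network are replaced by hypothetical toy functions (toy_q_value, toy_transition, toy_energy); the "state set" is assumed to be the states reached in the search tree; the state is assumed to be a 4-tuple of set points; and the max_depth cap is an added safeguard that does not appear in the algorithm.

```python
import math

NULL_ACTION = (0, 0, 0, 0)  # the stopping action from step (5)


def multistep_search(state, basic_actions, q_value, transition, max_depth=20):
    """Steps (4)-(7): chain greedy basic actions until the null action is chosen."""
    reached = [state]
    current = state
    for _ in range(max_depth):  # depth cap: added safeguard, not in the source
        # Step (6): pick the basic action with the highest Q-value.
        best = max(basic_actions, key=lambda a: q_value(current, a))
        if best == NULL_ACTION:
            break
        # Step (7): expand the search with the predicted next state.
        current = transition(current, best)
        reached.append(current)
    return reached


def k_nearest(origin, states, k):
    """Step (8): the k states closest to the origin (Euclidean distance)."""
    return sorted(states, key=lambda s: math.dist(origin, s))[:k]


def choose_set_point(state, basic_actions, q_value, transition, energy, k=3):
    """Steps (4)-(9) for one 30-minute control interval."""
    reached = multistep_search(state, basic_actions, q_value, transition)
    candidates = k_nearest(state, reached, k)
    # Step (9): lowest predicted energy consumption among the candidates.
    return min(candidates, key=energy)


# --- Hypothetical stand-ins for the trained models (assumptions) ---
def unit(i, sign):
    """Basic action that moves set point i by one step up or down."""
    return tuple(sign if j == i else 0 for j in range(4))


BASIC_ACTIONS = [NULL_ACTION] + [unit(i, s) for i in range(4) for s in (1, -1)]
TARGET = (22, 22, 22, 22)


def toy_transition(state, action):
    # Stand-in for the transition model: set points shift by the action.
    return tuple(s + a for s, a in zip(state, action))


def toy_q_value(state, action):
    # Stand-in for the DQN: prefers actions that move toward TARGET.
    nxt = toy_transition(state, action)
    return -sum(abs(x - t) for x, t in zip(nxt, TARGET))


def toy_energy(state):
    # Stand-in for the value network: predicted energy consumption.
    return sum((x - 22) ** 2 for x in state)


start = (24, 22, 22, 21)
reached = multistep_search(start, BASIC_ACTIONS, toy_q_value, toy_transition)
best_state = choose_set_point(
    start, BASIC_ACTIONS, toy_q_value, toy_transition, toy_energy, k=3
)
print(reached)
print(best_state)
```

In a real controller, choose_set_point would be called once per 30-minute interval (step (3)) and its result applied as the new set point (step (10)). Restricting the candidates to the k nearest states keeps the set-point change close to the current operating point, which is one plausible reading of step (8).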