Mathematical Problems in Engineering

Research Article

HVAC Optimal Control with the Multistep-Actor Critic Algorithm in Large Action Spaces

DQN-based HVAC control algorithm.

(1)	Initialize replay memory D to capacity N
(2)	Initialize action-value function Q with random weights θ
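(3)	Initialize target action-value function Q̂ with random weights θ⁻
(4)	For episode = 1 to M do
(5)	Reset building environment to initial state
(6)	Initialize sequence s_1 = {x_1} and preprocessed sequence φ_1 = φ(s_1)
(7)	For t = 1 to T do
(8)	If t mod k == 0 then
(9)	With probability ε select a random action a_t
(10)	Otherwise select a_t = argmax_a Q(φ(s_t), a; θ)
(11)	Execute action a_t in emulator and observe reward r_t and image x_{t+1}
(12)	Set s_{t+1} = s_t, a_t, x_{t+1} and preprocess φ_{t+1} = φ(s_{t+1})
(13)	Store transition (φ_t, a_t, r_t, φ_{t+1}) in D
(14)	Sample random minibatch of transitions (φ_i, a_i, r_i, φ_{i+1}) from D
(15)	Set y_i = r_i if the episode terminates at step i+1; otherwise y_i = r_i + γ max_a' Q̂(φ_{i+1}, a'; θ⁻)
(16)	Train Q with a gradient descent step on (y_i − Q(φ_i, a_i; θ))^2 with respect to the network parameters θ
(17)	Every C steps reset Q̂ = Q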
(18)	End if
(19)	End for
(20)	End for
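
The listing above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the implementation evaluated in the article: `ToyZoneEnv` is a hypothetical one-zone thermal model standing in for the building simulator, the deep network is swapped for a linear approximator Q(φ, a; θ) = θ[a]·φ so that only NumPy is needed, the last control step of each episode is treated as terminal, and all hyperparameter values are placeholders.

```python
import random
from collections import deque

import numpy as np


class ToyZoneEnv:
    """Hypothetical one-zone thermal model; a stand-in for the building simulator."""

    airflows = (0.0, 0.5, 1.0)  # assumed discrete airflow levels (the action set)

    def reset(self):
        self.temp = 26.0  # zone temperature in deg C
        return self._features()

    def _features(self):
        # phi(s): a bias term plus the scaled deviation from a 22 deg C setpoint.
        return np.array([1.0, (self.temp - 22.0) / 10.0])

    def step(self, action):
        flow = self.airflows[action]
        # Heat leaks in from 30 deg C outdoor air; airflow removes heat.
        self.temp += 0.1 * (30.0 - self.temp) - 1.5 * flow
        reward = -abs(self.temp - 22.0) - 0.1 * flow  # comfort violation + energy cost
        return self._features(), reward


def train(M=30, T=48, k=4, N=500, batch=16, gamma=0.9, eps=0.2, lr=0.05, C=20, seed=0):
    random.seed(seed)
    rng = np.random.default_rng(seed)
    env = ToyZoneEnv()
    n_actions = len(env.airflows)
    D = deque(maxlen=N)                                  # replay memory D with capacity N
    theta = rng.normal(scale=0.1, size=(n_actions, 2))   # Q(phi, a; theta) = theta[a] @ phi
    theta_target = rng.normal(scale=0.1, size=theta.shape)  # separate random target weights
    returns, control_steps = [], 0
    for _ in range(M):
        phi, total = env.reset(), 0.0
        for t in range(1, T + 1):
            if t % k != 0:
                continue  # between control steps a real simulator would hold the last action
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(theta @ phi))
            phi_next, r = env.step(a)
            done = t + k > T  # assumption: the last control step ends the episode
            D.append((phi, a, r, phi_next, done))
            # One gradient step per sampled transition on (y - Q(phi, a; theta))^2.
            for p, a_i, r_i, p_next, d in random.sample(D, min(batch, len(D))):
                y = r_i if d else r_i + gamma * np.max(theta_target @ p_next)
                theta[a_i] += lr * (y - theta[a_i] @ p) * p
            control_steps += 1
            if control_steps % C == 0:
                theta_target = theta.copy()  # every C control steps reset the target weights
            phi, total = phi_next, total + r
        returns.append(total)
    return theta, returns


theta, returns = train()
print(theta.shape, len(returns))
```

Calling `train()` returns the learned weight matrix and the per-episode returns. The `t % k` check mirrors step (8): the agent acts and learns only at control steps, which keeps the number of decisions per episode at T/k regardless of how finely the simulator is stepped.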