Research Article

PPDRL: A Pretraining-and-Policy-Based Deep Reinforcement Learning Approach for QoS-Aware Service Composition

Algorithm 1: PPDRL(cs, C, T, B).
Require: cs: the given composite service; C: the service class for cs; T: training steps; B: batch size.
Ensure: the optimal QoS value for cs
 Initialize the neural network parameters;
 Generate initial samples;
while convergence condition is not satisfied do
 Update samples with better results if not the first cycle;
 Pretrain the neural network based on MLE;
for t = 1 to T do
 Given and , get the candidate services score distribution ;
;
;
  end for
end while
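
The control flow of Algorithm 1 can be illustrated with a small, self-contained sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the neural network is replaced by a table of logits (one row per service class), the QoS of a plan is taken to be the sum of per-service scores, the convergence condition is replaced by a fixed number of cycles, and the two update steps inside the inner loop, which are not legible in the listing, are substituted with a plain REINFORCE update using a mean-reward baseline. All function and parameter names (`ppdrl_sketch`, `cycles`, `lr`, `seed`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_plan(theta, rng):
    """Pick one candidate index per service class from the score distribution."""
    return [rng.choices(range(len(row)), weights=softmax(row))[0] for row in theta]


def qos(plan, C):
    """Assumed additive QoS: sum of the chosen candidates' scores."""
    return sum(C[i][j] for i, j in enumerate(plan))


def apply_gradient(theta, plans, weights, lr):
    """One ascent step on sum_n weights[n] * log p_theta(plans[n])."""
    probs = [softmax(row) for row in theta]  # freeze probabilities for the whole batch
    for plan, w in zip(plans, weights):
        for i, j in enumerate(plan):
            for k in range(len(theta[i])):
                indicator = 1.0 if k == j else 0.0
                theta[i][k] += lr * w * (indicator - probs[i][k])


def ppdrl_sketch(C, T=50, B=8, cycles=3, lr=0.5, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * len(row) for row in C]              # initialize parameters
    samples = [sample_plan(theta, rng) for _ in range(2 * B)]  # initial samples
    pool = []
    for cycle in range(cycles):                          # stand-in for the convergence test
        if cycle > 0:                                    # update samples with better results
            merged = samples + pool
            samples = sorted(merged, key=lambda p: qos(p, C), reverse=True)[: 2 * B]
            pool = []
        # Pretraining by maximum likelihood on the current sample set.
        apply_gradient(theta, samples, [1.0 / len(samples)] * len(samples), lr)
        for _ in range(T):                               # policy-gradient phase
            batch = [sample_plan(theta, rng) for _ in range(B)]
            rewards = [qos(p, C) for p in batch]
            baseline = sum(rewards) / B
            # Assumed REINFORCE update; the source's own update rule is not shown.
            apply_gradient(theta, batch, [(r - baseline) / B for r in rewards], lr)
            pool.extend(batch)
    best = max(samples + pool, key=lambda p: qos(p, C))
    return qos(best, C), best
```

Calling `ppdrl_sketch(C)` with `C` given as a list of per-class candidate scores returns the best QoS value seen together with the corresponding plan. The separation into a likelihood-based pretraining step and a reward-weighted step mirrors the two phases of the listing; both reuse the same log-likelihood gradient and differ only in how each sample is weighted.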