| HPSOAC Algorithm | 
| Input: Server set $S=\{s_1,\dots,s_x\}$, VM set $V=\{v_1,\dots,v_n\}$ and Task set $T=\{t_1,\dots,t_m\}$, with the parameters $\gamma$, $\lambda$, $w$, $c_1$, $c_2$. | 
| Output: Reduced makespan, reduced energy consumption, increased resource utilization and balanced VM load. | 
| Assumption: | 
| Particle or Agent → VM | 
| Position → Load of VM allocated on PM | 
| Velocity → Speed of task transfer | 
| Pbest → Individual VM performance | 
| Gbest → Optimal result | 
| Iteration or Episode → Time period | 
| Information of Server and VM → MIPS, Memory, CPU and Bandwidth | 
| Information of Task → Length and File size | 
| Start | 
| For i =1 to m | 
| For j =1 to n | 
| Incoming task $t_i$ is allocated to VM $v_j$ | 
| // For Makespan | 
| Calculate the processing rate of the VM, the expected finishing time and the task allocation time by applying Eq. (1), (2) and (3): | 
| $PR_j = Num\_PE_j \times MIPS_j$ (1) | 
| $EFT_{ij} = Length_i / PR_j$ (2) | 
| $AT_{ij} = FileSize_i / BW_j$ (3) | 
| Find the Finishing Time of each task on the VM using Eq. (4): | 
| $FT_{ij} = ST_{ij} + EFT_{ij}$ (4) | 
| Calculate the Makespan Time by utilizing Eq. (6): | 
| $Makespan = \max_{j \in \{1,\dots,n\}} FT_j$ (6) | 
| //For Resource Utilization | 
| Compute the Resource Utilization by using Eq. (8): | 
| $RU = \sum_{j=1}^{n} FT_j \,/\, (Makespan \times n)$ (8) | 
| End of for | 
| End of for | 
| //For Energy Consumption | 
| Initialize the active and sleeping VMs, and the utilization of both VMs and servers. | 
| For j =1 to n | 
| For s =1 to x | 
| Compute the utilization of each active server by using Eq. (10): | 
| $U_s = \sum_{j=1}^{n_s} U_{VM_j}$ (10) | 
| Compute the energy consumption of an active server by using Eq. (11): | 
| $E_s = (E_{max} - E_{min}) \times U_s + E_{min}$ (11) | 
| If ($U_s = 0$) | 
| Put the server into sleep mode | 
| Else | 
| Wake up the sleeping server | 
| End of if | 
| Compute the total energy consumption by using Eq. (17): | 
| $E_{total} = \sum_{s=1}^{x} E_s$ (17) | 
| End of for | 
| End of for | 
| // Applying HPSOAC | 
| //Initialize AC parameters | 
| Initialize j, e, the learning rates $\alpha_a$, $\alpha_c$, the discount factor $\gamma$ and the trace-decay parameter $\lambda$ | 
| // Initialize PSO parameters | 
| Initialize the inertia weight $w$, the acceleration coefficients $c_1$, $c_2$, the population size and the number of iterations | 
| For j =1 to n | 
| Initialize the individual best $Pbest_j$, the current position $x_j$ and velocity $v_j$ of agent j | 
| End of for | 
| Initialize the global best $Gbest$ and the target position | 
| For e =1 to $e_{max}$ | 
| For j =1 to n | 
| Observe the initial environment state $s_1$. | 
| For t =1 to $t_{max}$ | 
| The agent takes action $a_t$ according to Eq. (33): | 
| $a_t \sim \pi_\theta(a_t \mid s_t)$ (33) | 
| Receive the current reward $r_t$ and perceive the next state $s_{t+1}$. | 
| Compute the value function by using Eq. (40): | 
| $V_\omega(s_t) = \omega^{T} \phi(s_t)$ (40) | 
| Compute the TD error by using Eq. (28): | 
| $\delta_t = r_t + \gamma V_\omega(s_{t+1}) - V_\omega(s_t)$ (28) | 
| Update the eligibility trace by Eq. (34): | 
| $e_t = \gamma \lambda e_{t-1} + \nabla_\omega V_\omega(s_t)$ (34) | 
| Update the critic parameters by Eq. (35): | 
| $\omega \leftarrow \omega + \alpha_c \delta_t e_t$ (35) | 
| Compute the advantage function by Eq. (25): | 
| $A(s_t, a_t) = Q(s_t, a_t) - V_\omega(s_t) \approx \delta_t$ (25) | 
| Update the policy gradient by using Eq. (41): | 
| $\nabla_\theta J(\theta) = \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, A(s_t, a_t)$ (41) | 
| Update the policy parameters by Eq. (42): | 
| $\theta \leftarrow \theta + \alpha_a \delta_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)$ (42) | 
| End of for | 
| // Compute fitness value | 
| Calculate the cumulative reward $R_j = \sum_t r_t$ as the fitness value of agent j. | 
| // Comparing current fitness with individual best | 
| If $R_j > Pbest_j$ | 
| Then $Pbest_j = R_j$ | 
| End if | 
| // Find out global best value from all individual best | 
| Find $Pbest^{*} = \max_{j \in \{1,\dots,n\}} Pbest_j$ | 
| If $Pbest^{*} > Gbest$ | 
| Then $Gbest = Pbest^{*}$ | 
| Else retain $Gbest$ as the optimal solution | 
| End if | 
| Update episode | 
| Update the inertia weight and velocity according to Eq. (43) and (44) | 
| End of for | 
| End of for | 
| End |
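
The makespan and resource-utilization steps above can be sketched as follows. This is a minimal illustration assuming the standard CloudSim-style formulas (processing rate = number of PEs × MIPS, finishing time = task length / processing rate); the function and variable names are illustrative, not from the paper.

```python
# Hedged sketch of the makespan / resource-utilization loop.
# Assumption: each VM's busy time is the sum of the expected finishing
# times of the tasks assigned to it.

def processing_rate(num_pe: int, mips: float) -> float:
    """Eq. (1)-style processing rate of a VM: processing elements times MIPS."""
    return num_pe * mips

def schedule_metrics(task_lengths, vm_specs, assignment):
    """Return (makespan, average resource utilization).

    task_lengths : list of task lengths in MI
    vm_specs     : list of (num_pe, mips) tuples, one per VM
    assignment   : assignment[i] = index of the VM running task i
    """
    busy = [0.0] * len(vm_specs)            # accumulated busy time per VM
    for i, length in enumerate(task_lengths):
        j = assignment[i]
        num_pe, mips = vm_specs[j]
        busy[j] += length / processing_rate(num_pe, mips)  # Eq. (2)-style EFT
    makespan = max(busy)                                   # Eq. (6)-style makespan
    utilization = sum(busy) / (makespan * len(vm_specs))   # Eq. (8)-style RU
    return makespan, utilization
```

With two equal VMs and tasks of 1000 and 2000 MI on separate VMs, the makespan is driven by the slower VM while the idle fraction of the faster one lowers utilization.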
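
The energy step can be sketched with the common linear power model for active servers plus a small fixed draw for sleeping ones, matching the sleep/wake branch in the listing. The wattage constants below are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of the energy-consumption step.
# Assumption: linear power model E = (E_max - E_min) * U + E_min for an
# active server, and a small constant draw for a sleeping (U = 0) server.

P_MAX, P_MIN, P_SLEEP = 250.0, 150.0, 10.0   # watts (assumed values)

def server_power(utilization: float) -> float:
    """Power draw of one server given CPU utilization in [0, 1]."""
    if utilization == 0.0:
        return P_SLEEP                        # idle server is put into sleep mode
    return (P_MAX - P_MIN) * utilization + P_MIN

def total_energy(utilizations, hours: float) -> float:
    """Total energy (Wh) consumed by all servers over `hours`."""
    return sum(server_power(u) for u in utilizations) * hours
```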
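
One inner actor–critic step (TD error, eligibility trace, critic update) can be sketched as below, assuming a linear critic $V_\omega(s)=\omega^{T}\phi(s)$, for which $\nabla_\omega V_\omega(s)=\phi(s)$. Feature vectors and learning rates are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of one TD(lambda) critic step with an eligibility trace,
# mirroring the TD-error / trace / critic-update sequence in the listing.

def ac_critic_step(phi_s, phi_s_next, reward, omega, trace,
                   gamma=0.9, lam=0.8, alpha_c=0.05):
    """Return (td_error, updated omega, updated trace)."""
    v_s = omega @ phi_s                      # V(s_t) with a linear critic
    v_next = omega @ phi_s_next              # V(s_{t+1})
    delta = reward + gamma * v_next - v_s    # TD error
    trace = gamma * lam * trace + phi_s      # trace accumulates grad of V
    omega = omega + alpha_c * delta * trace  # critic parameter update
    return delta, omega, trace
```

The actor side then scales $\nabla_\theta \log \pi_\theta(a_t \mid s_t)$ by the same `delta`, since the TD error is an unbiased estimate of the advantage.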
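
Finally, the per-episode PSO bookkeeping (compare each agent's cumulative reward with its personal best, pick the global best, then apply the standard velocity and position updates) might look like the sketch below. Scalar positions and the constants `w`, `c1`, `c2` are simplifying assumptions for illustration.

```python
import random

# Hedged sketch of the pbest/gbest comparison and the classic PSO
# velocity/position update at the end of each episode.

def pso_update(positions, velocities, fitness, pbest, pbest_fit, gbest,
               w=0.7, c1=1.5, c2=1.5):
    # Update each agent's personal best (fitness is maximized).
    for j in range(len(positions)):
        if fitness[j] > pbest_fit[j]:
            pbest_fit[j], pbest[j] = fitness[j], positions[j]
    # Global best is the best of all personal bests.
    g = max(range(len(pbest)), key=lambda j: pbest_fit[j])
    gbest = pbest[g]
    # Standard PSO velocity and position updates.
    for j in range(len(positions)):
        r1, r2 = random.random(), random.random()
        velocities[j] = (w * velocities[j]
                         + c1 * r1 * (pbest[j] - positions[j])
                         + c2 * r2 * (gbest - positions[j]))
        positions[j] += velocities[j]
    return positions, velocities, pbest, pbest_fit, gbest
```

Setting `w = c1 = c2 = 0` freezes the swarm and isolates the pbest/gbest bookkeeping, which is convenient for testing.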