Enhanced Route Discovery Mechanism Using Improved CH Selection Using Q-Learning to Minimize Delay
Algorithm 1
Algorithm reward mechanism.
(1)
//extract the states by collecting the route values, as described in the state definition earlier in the article//
(2)
//create an empty Q-repository that will hold the Q-table and the transition results used to update the Q-table//
(3)
//set k = 2, as there are two action labels, namely optimal and nonoptimal//
(4)
//apply k-means with k = 2 to the data gathered in the itr time interval; k-means returns the action label for each route index, defined as 1 for optimal and 2 for nonoptimal//
//if the R-MSE is higher, the class is labelled as nonoptimal, and vice versa//
(7)
//initialize the Q-table as illustrated in Table 4//
(8)
//SVM rating follows polynomial kernel for hyperplane separation//
(9)
//the transition is said to be successful if the data gathered as state variables map to the hyperplane while being selected for the gradient satisfaction of the policies defined under . //
(10)
where T’ .
(11)
(12)
//the route receives the maximum reward for this action//
(13)
(14)
//the route receives a discounted reward for this action//
(15)
Else
(16)
//the route receives no reward for this action and may also be penalized, depending on the distance of the result from its ground truth//
(17)
//create a policy T-policy according to the transition actions and update the learning policy//
(18)
//the proposed transition function introduces a semi-successful transition: if the state was selected under the plane policy of the transition function but was unsuccessful under the mapping policy, the transition is called semi-successful and receives a partial reward//
(19)
//where T is the transition function, is the selection strategy declared in kernel K and dignified under . //
(20)
//calculate the parameters defined in equations (14) to (16)//
(21)
(20) //normalized throughput and PDR will both be high if the delay and the energy consumption are low//
(22)
(23)
//calculate the transition value of the route in the list by applying equation (20) to the state variables of the current route//
(24)
>
(25)
//dcb is the direct connection benefit: if the transition function is satisfied completely, the value of dcb is .1//
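
The three reward tiers in Algorithm 1 (maximum reward for a successful transition, discounted reward for a semi-successful one, and no reward or a penalty otherwise) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `classify_transition`, `reward`, and `q_update` are hypothetical, and so are the numeric values `r_max`, `discount`, `penalty_scale`, `alpha`, and `gamma`. The Q-table update shown is the standard tabular Q-learning rule, assumed here because the algorithm's own update equation is not given in the listing.

```python
from enum import Enum


class Transition(Enum):
    SUCCESSFUL = "successful"            # plane policy and mapping policy both satisfied
    SEMI_SUCCESSFUL = "semi_successful"  # selected under plane policy, failed mapping policy
    UNSUCCESSFUL = "unsuccessful"        # not selected under the plane policy


# Action labels as in the algorithm: 1 = optimal, 2 = nonoptimal.
ACTIONS = (1, 2)


def classify_transition(plane_ok: bool, mapping_ok: bool) -> Transition:
    """Map the two policy checks to one of the three transition outcomes."""
    if plane_ok and mapping_ok:
        return Transition.SUCCESSFUL
    if plane_ok:
        return Transition.SEMI_SUCCESSFUL
    return Transition.UNSUCCESSFUL


def reward(transition: Transition, error: float = 0.0, r_max: float = 1.0,
           discount: float = 0.5, penalty_scale: float = 0.1) -> float:
    """Three-tier reward; all numeric defaults are illustrative assumptions."""
    if transition is Transition.SUCCESSFUL:
        return r_max                     # maximum reward
    if transition is Transition.SEMI_SUCCESSFUL:
        return discount * r_max          # partial (discounted) reward
    # No reward; penalty grows with the distance from the ground truth.
    return -penalty_scale * error


def q_update(q: dict, state, action: int, r: float, next_state,
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Standard tabular Q-learning update (assumed, not taken from the paper)."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    q_table = {}  # empty Q-repository, keyed by (state, action)
    outcome = classify_transition(plane_ok=True, mapping_ok=False)
    r = reward(outcome)
    q_update(q_table, "route_3", 1, r, "route_5")
    print(outcome.value, r, q_table)
```

Keeping the outcome classification separate from the reward values means the partial reward for semi-successful transitions can be tuned without touching the policy checks.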