Research Article

Enhanced Route Discovery Mechanism Using Improved CH Selection Using Q-Learning to Minimize Delay

Algorithm 1

Algorithm reward mechanism.
(1) // extract the states by collecting the route values as illustrated in the state definition earlier in the article//
(2) //create an empty Q-repository, which will contain the Q-table and the transition results used to update the Q-table//
(3) //set the value of k = 2, as there are two action labels, namely optimal and nonoptimal//
(4) //apply k-means with k = 2 to the data gathered in the itr time interval; k-means returns the route index, with the action label defined as 1 for optimal and 2 for nonoptimal//
(5) equation (7)
(6) //if the R-MSE is higher, the class is labelled as nonoptimal, and vice versa//
(7) //initialize the Q-table as illustrated in Table 4//
(8) //the SVM rating uses a polynomial kernel for hyperplane separation//
(9) //the transition is said to be successful if the data gathered as state variables maps the hyperplane while being selected for the gradient satisfaction of the policies defined under . //
(10) where T’ .
(11)
(12) //represents that the route will get maximum reward for this action//
(13)
(14) //represents that the route will get discounted reward for this action.//
(15) Else
(16) //represents that the route will not get any reward for this action and might also be penalized depending upon the distance of the result to its ground truth.//
(17) //create a policy T-policy according to the transition actions and update the learning policy//
(18) //the proposed transition function introduces a semi-successful transition: if the state was selected under the plane policy of the transition function but was unsuccessful under the mapping policy, the transition is called semi-successful and receives a partial reward as well//
(19) //where T is the transition function, is the selection strategy declared in kernel K and dignified under . //
(20) //calculate the defined parameters stated under equations (14) to (16)//
(21) equation (20) //both neutralized throughput and PDR will be high if the delay and the energy consumption are low//
(22)
(23) //calculate the transition value of the route in the list by applying equation (20) to the state variable of the current route//
(24)>
(25) //dcb is the direct connection benefit: if the transition function is satisfied completely, the value of dcb is .1//
(26) > ψ||  < ψ
(27) Reward = 
(28) Else
(29) Reward = 0
(30) End If
(31) Update Q-table
(32) End For
(33)
(34) Choose Route
(35) Add to List()
(36) Return List if transition completed.
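
The reward tiers and the Q-table update in Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The reward constants (MAX_REWARD, DISCOUNT_FACTOR), the boolean flags plane_ok and mapping_ok, the helper names, and the learning parameters alpha and gamma are all hypothetical. The update rule shown is the standard tabular Q-learning update, assumed here because the listing does not state its own rule. The optional penalty mentioned in step (16) is omitted.

```python
from collections import defaultdict

OPTIMAL, NONOPTIMAL = 1, 2  # action labels from steps (3)-(4)

# Hypothetical reward constants; the listing gives no numeric values.
MAX_REWARD = 1.0
DISCOUNT_FACTOR = 0.5  # share of MAX_REWARD for a semi-successful transition


def label_clusters(rmse_by_cluster):
    """Step (6): the cluster with the higher R-MSE is labelled nonoptimal."""
    worst = max(rmse_by_cluster, key=rmse_by_cluster.get)
    return {c: (NONOPTIMAL if c == worst else OPTIMAL) for c in rmse_by_cluster}


def transition_reward(plane_ok, mapping_ok):
    """Three-tier reward: successful, semi-successful, unsuccessful.

    plane_ok   -- the state was selected under the plane policy (assumed flag)
    mapping_ok -- the state also satisfied the mapping policy (assumed flag)
    """
    if plane_ok and mapping_ok:   # successful transition: maximum reward
        return MAX_REWARD
    if plane_ok:                  # semi-successful transition: partial reward
        return DISCOUNT_FACTOR * MAX_REWARD
    return 0.0                    # unsuccessful transition: no reward


def new_q_table():
    """Empty Q-table: q[state][action] defaults to 0.0."""
    return defaultdict(lambda: defaultdict(float))


def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (an assumption, see lead-in)."""
    best_next = max(q_table[next_state].values(), default=0.0)
    old = q_table[state][action]
    q_table[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q_table[state][action]


if __name__ == "__main__":
    q = new_q_table()
    labels = label_clusters({0: 0.8, 1: 0.2})
    r = transition_reward(plane_ok=True, mapping_ok=False)
    print(labels, r, update_q(q, "route_1", OPTIMAL, r, "route_2"))
```

Keeping the semi-successful case as a fixed fraction of the maximum reward is only one possible design; a distance-based discount or penalty, as hinted at in step (16), would replace the constant with a function of the deviation from the ground truth.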