Scientific Programming

Research Article

Enhanced Route Discovery Mechanism Using Improved CH Selection Using Q-Learning to Minimize Delay

Algorithm 1

Algorithm reward mechanism.


(1)	// extract the states by collecting the route values as illustrated in the state definition earlier in the article//
(2)	//create an empty Q-repository which will contain the Q-table and transition results to update the Q-table
(3)	// set the value of k = 2 as there are two action labels namely optimal and nonoptimal
(4)	//apply k-means with k = 2 to the data gathered in the itr time interval; k-means returns the route index together with its action label, defined as 1 for optimal and 2 for nonoptimal//
(5)	equation (7)
(6)	//if the R-MSE is higher, the class is labelled as nonoptimal, and vice versa//
(7)	//initialize the Q-table as illustrated in Table 4.
(8)	//SVM rating follows polynomial kernel for hyperplane separation//
(9)	//the transition is said to be successful if the data gathered as state variables maps the hyperplane while getting selected for the gradient satisfaction of the policies defined under . //
(10)	where T’ .
(11)
(12)	//represents that the route will get maximum reward for this action//
(13)
(14)	//represents that the route will get discounted reward for this action.//
(15)	Else
(16)	//represents that the route will not get any reward for this action and might also be penalized depending upon the distance of the result to its ground truth.//
(17)	//create a policy T-policy according to the transition actions and update the learning policy//
(18)	//the proposed transition function introduces a semi-successful transition: if the state was selected under the plane policy of the transition function but was unsuccessful under the mapping policy, the transition is called semi-successful and receives a partial reward//
(19)	//where T is the transition function, is the selection strategy declared in kernel K and dignified under . //
(20)	//calculate the defined parameters stated under equations (14) to (16).
(21)	equation (20) //both neutralized throughput and PDR will be high if the delay is low, and likewise for the energy consumption//
(22)
(23)	Calculate //the transition value of the route in the list, obtained by applying equation (20) to the state variable of the current route//
(24)	>
(25)	//dcb is the direct connection benefit: if the transition function is satisfied completely, the value of dcb is 0.1//
(26)	> ψ || < ψ
(27)	Reward =
(28)	Else
(29)	Reward = 0
(30)	End If
(31)	Update Q-table
(32)	End For
(33)
(34)	Choose Route
(35)	Add to List()
(36)	Return List if transition completed.
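
The three reward tiers described in steps (12) to (18) and the Q-table update in step (31) can be sketched as follows. This is a minimal illustration, not the article's implementation: the reward magnitudes (MAX_REWARD, DISCOUNT), the learning rate alpha, the discount factor gamma, and the function names are all hypothetical. The update rule is the standard tabular Q-learning rule, assumed here because the listing only says "Update Q-table". The optional penalty mentioned in step (16) is omitted.

```python
from collections import defaultdict

# Action labels from step (4): 1 = optimal, 2 = nonoptimal.
OPTIMAL, NONOPTIMAL = 1, 2

# Hypothetical magnitudes: the listing only distinguishes "maximum",
# "discounted" (partial), and "no" reward.
MAX_REWARD = 1.0
DISCOUNT = 0.5


def transition_reward(selected_by_plane, satisfies_mapping):
    """Return the reward tier for one route transition."""
    if selected_by_plane and satisfies_mapping:
        return MAX_REWARD             # successful transition
    if selected_by_plane:
        return DISCOUNT * MAX_REWARD  # semi-successful transition
    return 0.0                        # unsuccessful transition


def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (an assumption, see lead-in)."""
    # Best known value of the next state; 0.0 if it has not been visited yet.
    best_next = max(q_table[next_state].values(), default=0.0)
    old = q_table[state][action]
    q_table[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q_table[state][action]


# Q-table as a nested mapping: state -> action label -> Q-value.
q_table = defaultdict(lambda: defaultdict(float))

# Example: a route selected by the plane policy that fails the mapping policy.
r = transition_reward(selected_by_plane=True, satisfies_mapping=False)
update_q(q_table, "route_1", OPTIMAL, r, "route_2")
```

A nested mapping keeps the sketch independent of the exact layout of Table 4; a fixed-size array indexed by route and action label would serve equally well once the state space is known.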