Research Article
An Empirical Investigation of Transfer Effects for Reinforcement Learning
Algorithm 2
The algorithm for training the non-transfer and transfer RL methods.
input: S_training, TRQ_n−1[S_n−1, A_n−1]
(1)  initialize
(2)    new NRQ_n[S_n, A_n]
(3)    new TRQ_n[S_n, A_n]
(4)    TRQ_n[S_n, A_n] ⟵ TRQ_n−1[S_n−1, A_n−1]
(5)    upper_bound = n + 1
(6)    assign S_training to s_nt and s_tr
(7)    finish = FALSE
(8)    NonTrans_Tr_Steps = 0
(9)    Trans_Tr_Steps = 0
(10) repeat
(11)   NRQ_n[S_n, A_n], Steps_nt = RL_Sort(s_nt, NRQ_n[S_n, A_n])
(12)   TRQ_n[S_n, A_n], Steps_tr = RL_Sort(s_tr, TRQ_n[S_n, A_n])
(13)   NonTrans_Tr_Steps = NonTrans_Tr_Steps + Steps_nt
(14)   Trans_Tr_Steps = Trans_Tr_Steps + Steps_tr
(15)   sort the n! lists in S_n by NRQ_n, compute the average Avg_nt, and pick the list with the maximum value as s_nt
(16)   sort the n! lists in S_n by TRQ_n, compute the average Avg_tr, and pick the list with the maximum value as s_tr
(17)   if (|Avg_nt − Avg_tr| / Avg_tr <= 0.1) or (Avg_nt <= upper_bound and Avg_tr <= upper_bound)
(18)     finish = TRUE
(19) until finish is TRUE
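The control flow of Algorithm 2 can be sketched in Python as follows. This is a minimal, hypothetical illustration, not the paper's implementation: `rl_sort` and the list-scoring function `score` are placeholder stubs (the real RL_Sort would perform Q-learning updates, and the real score would come from the learned Q-values), and all variable names beyond those in the listing are assumptions.

```python
import itertools

def rl_sort(lists, q_table):
    """Hypothetical stand-in for RL_Sort: returns the (nominally updated)
    Q-table and the number of training steps taken. Only the step count is
    simulated here; a real version would apply Q-learning updates."""
    steps = sum(len(lst) for lst in lists)
    return q_table, steps

def score(perm, q_table):
    """Placeholder list value: one point per adjacent in-order pair, plus
    one. The real scoring would depend on q_table; this stub ignores it."""
    return sum(1 for a, b in zip(perm, perm[1:]) if a <= b) + 1

def train(n, s_training, trq_prev, tol=0.1, max_iters=100):
    """Sketch of Algorithm 2: trains a fresh non-transfer Q-table (NRQ_n)
    and a transfer Q-table (TRQ_n, seeded from TRQ_{n-1}) side by side,
    stopping when their averages are within tol of each other or both
    fall at or below upper_bound."""
    nrq = {}                                        # (2) new NRQ_n
    trq = dict(trq_prev)                            # (3)-(4) seed TRQ_n
    upper_bound = n + 1                             # (5)
    s_nt, s_tr = list(s_training), list(s_training)  # (6)
    nontrans_steps = trans_steps = 0                # (8)-(9)
    perms = list(itertools.permutations(range(n)))  # the n! lists in S_n
    for _ in range(max_iters):                      # (10) repeat
        nrq, steps_nt = rl_sort(s_nt, nrq)          # (11)
        trq, steps_tr = rl_sort(s_tr, trq)          # (12)
        nontrans_steps += steps_nt                  # (13)
        trans_steps += steps_tr                     # (14)
        # (15)-(16): score all n! lists under each table, average them,
        # and carry the best-scoring list forward as the next input
        scores_nt = [score(p, nrq) for p in perms]
        scores_tr = [score(p, trq) for p in perms]
        avg_nt = sum(scores_nt) / len(scores_nt)
        avg_tr = sum(scores_tr) / len(scores_tr)
        s_nt = [list(max(perms, key=lambda p: score(p, nrq)))]
        s_tr = [list(max(perms, key=lambda p: score(p, trq)))]
        # (17)-(19): convergence test as stated in the listing
        if abs(avg_nt - avg_tr) / avg_tr <= tol or (
            avg_nt <= upper_bound and avg_tr <= upper_bound
        ):
            break
    return nontrans_steps, trans_steps
```

The two accumulated step counts are what the empirical comparison uses: if `trans_steps` is consistently smaller than `nontrans_steps`, the transferred Q-table reduced the training effort.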