Research Article

Investigating the Effects of Hyperparameters in Quantum-Enhanced Deep Reinforcement Learning

Table 3

Method of calculating the expectation value for action selection (the values shown are assumed for illustration).

 

Total number of repeated measurements        500    500    500    500
Total number of measurements that give 1     330    400    350    190
Total number of measurements that give 0     170    100    150    310
Probability of getting 1, P(1)               0.66   0.8    0.7    0.38
Probability of getting 0, P(0)               0.34   0.2    0.3    0.62
Expectation value                            0.66   0.8    0.7    0.62
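
The arithmetic behind Table 3 can be sketched in a few lines of Python. The counts are copied from the table. The names `TOTAL_SHOTS`, `ONES`, `outcome_probabilities`, and `table_value` are hypothetical and introduced only for this illustration. The last row is taken here to be the larger of P(1) and P(0). That rule is an inference from the four columns shown, not a definition stated in the table, so it should be read as an assumption.

```python
# Counts copied from Table 3, one entry per column of the table.
TOTAL_SHOTS = [500, 500, 500, 500]
ONES = [330, 400, 350, 190]


def outcome_probabilities(ones, shots):
    """Estimate (P(1), P(0)) from the number of 1-outcomes in `shots` runs."""
    zeros = shots - ones
    return ones / shots, zeros / shots


def table_value(ones, shots):
    """Reproduce the 'Expectation value' row of Table 3.

    Assumption: the row equals max(P(1), P(0)); this is inferred from the
    tabulated numbers rather than stated by the source.
    """
    p1, p0 = outcome_probabilities(ones, shots)
    return max(p1, p0)


probabilities = [outcome_probabilities(o, s) for o, s in zip(ONES, TOTAL_SHOTS)]
values = [round(table_value(o, s), 2) for o, s in zip(ONES, TOTAL_SHOTS)]

for (p1, p0), v in zip(probabilities, values):
    print(f"P(1)={p1:.2f}  P(0)={p0:.2f}  table value={v:.2f}")
```

Under this assumption the fourth column gives 0.62 because P(0) exceeds P(1) there. If the expectation value were instead defined as P(1) alone, that column would read 0.38.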