Research Article
Deep Ensemble Reinforcement Learning with Multiple Deep Deterministic Policy Gradient Algorithm
Table 1
Performance comparison of subpolicies and the aggregated policy.
| Policy | Episodes | Total reward | Average reward |
| --- | --- | --- | --- |
| Subpolicy1 | 20 | 720.69 | 36.03 |
| Subpolicy2 | 20 | 538.28 | 26.91 |
| Subpolicy3 | 20 | 463.98 | 23.20 |
| Aggregated policy | 20 | 829.17 | 41.46 |
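The figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below assumes that the average reward is the total reward divided by the number of episodes. This matches the reported numbers to two decimal places, though the excerpt does not state the definition explicitly. The names `rows` and `average_reward` are illustrative and do not come from the article.

```python
# Values transcribed from Table 1:
# (policy, episodes, total reward, reported average reward)
rows = [
    ("Subpolicy1", 20, 720.69, 36.03),
    ("Subpolicy2", 20, 538.28, 26.91),
    ("Subpolicy3", 20, 463.98, 23.20),
    ("Aggregated policy", 20, 829.17, 41.46),
]


def average_reward(total, episodes):
    """Assumed definition: total reward divided by number of episodes."""
    return total / episodes


# Recompute each average and show it next to the reported value.
computed = {}
for name, episodes, total, reported in rows:
    computed[name] = average_reward(total, episodes)
    print(f"{name}: computed {computed[name]:.2f}, reported {reported:.2f}")

# Margin of the aggregated policy over the best single subpolicy,
# using the reported average rewards.
best_sub = max(r[3] for r in rows if r[0].startswith("Subpolicy"))
aggregated = rows[-1][3]
gain = aggregated - best_sub
print(f"Gain over best subpolicy: {gain:.2f} ({gain / best_sub:.1%})")
```

Under this assumption, the recomputed averages agree with the reported column, and the aggregated policy's average reward exceeds that of the best subpolicy (Subpolicy1) over the same 20 episodes.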