Research Article

Research on Dynamic Path Planning of Wheeled Robot Based on Deep Reinforcement Learning on the Slope Ground

Figure 9

The average cumulative reward curve. Each point is the average cumulative reward achieved per hundred episodes. The y-axis denotes the average cumulative reward, and the x-axis denotes the iteration epoch.
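The per-point averaging described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `block_average` and the parameter `block_size` are hypothetical. It also assumes that "per hundred episodes" means consecutive, non-overlapping blocks of 100 episodes. The caption does not say whether a sliding window was used instead.

```python
from typing import List, Sequence


def block_average(episode_rewards: Sequence[float], block_size: int = 100) -> List[float]:
    """Average the cumulative reward over consecutive, non-overlapping blocks.

    Assumption: each plotted point covers one complete block of `block_size`
    episodes. A trailing partial block is dropped rather than averaged.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")

    averages: List[float] = []
    # Visit only the start indices of complete blocks.
    for start in range(0, len(episode_rewards) - block_size + 1, block_size):
        block = episode_rewards[start:start + block_size]
        averages.append(sum(block) / block_size)
    return averages


# Synthetic data for illustration only: 300 episodes give three plotted points.
synthetic_rewards = [float(i) for i in range(300)]
points = block_average(synthetic_rewards)
```

Under this assumption, a run of N episodes yields N // 100 points on the curve. A sliding-window average would give a smoother curve with more points, so the choice affects how the plot should be read.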