Research Article
Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning
Table 3
Results for dynamic pricing with positive lead time.
| Method | Lifetime | | | | |
| --- | --- | --- | --- | --- | --- |
| PAQ-DQN | 2 | 3242.104 | 2911.139 | 254.905 | 291.997 |
| PAQ-DQN | 3 | 4734.766 | 4455.814 | 40.439 | 110.146 |
| PAQ-DQN | 4 | 4840.443 | 4714.712 | 34.066 | 28.642 |
| PAQ-A2C | 2 | 3081.546 | 2305.394 | 232.000 | 573.749 |
| PAQ-A2C | 3 | 4513.786 | 4474.915 | 207.470 | 130.278 |
| PAQ-A2C | 4 | 4919.332 | 2278.324 | 49.802 | 748.916 |