Research Article
Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning
| Method | Lifetime | Lead time | | | Lead time | | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PAQ-DQN | 2 | 0 | 5377.021 | 18.747 | 1 | 3242.104 | 254.905 |
| PAQ-A2C | 2 | 0 | 5385.653 | 21.222 | 1 | 3081.546 | 232.000 |
| Q-learning | 2 | 0 | 4800.622 | 31.376 | 1 | 592.721 | 151.462 |