Research Article
Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
| (1) | Input: , feature , , and learning rate |
| (2) | |
| (3) | compute teacher model {using equations (4)–(6)} |
| (4) | enhance the demonstration {using equation (9)} |
| (5) | while not converged |
| (6) | compute , and {using equations (3) and (12)} |
| (7) | compute {using equation (11)} |
| (8) | |
| (9) | end while |
| (10) | Output: |
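
The control flow of the listing above can be sketched in Python as follows. This is only a minimal skeleton, not the authors' implementation: the symbols and equations referenced in the listing are not reproduced here, so every computation is delegated to a caller-supplied function. The names `run_learning_loop`, `build_teacher_model`, `enhance_demonstrations`, and `update_step` are hypothetical, as are the assumptions that the inputs include demonstrations and feedback values, that steps (6)–(8) together produce an updated parameter vector, and that convergence means the largest parameter change falls below a tolerance.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def run_learning_loop(
    demonstrations: Sequence[Sequence[float]],
    feedback: Sequence[float],
    theta0: Sequence[float],
    learning_rate: float,
    build_teacher_model: Callable,     # hypothetical stand-in for step (3)
    enhance_demonstrations: Callable,  # hypothetical stand-in for step (4)
    update_step: Callable,             # hypothetical stand-in for steps (6)-(8)
    tol: float = 1e-6,                 # assumed convergence tolerance
    max_iters: int = 1000,             # assumed safety cap on iterations
) -> Vector:
    theta = list(theta0)                                        # step (2), assumed initialization
    teacher = build_teacher_model(demonstrations, feedback)     # step (3)
    enhanced = enhance_demonstrations(demonstrations, teacher)  # step (4)
    for _ in range(max_iters):                                  # step (5)
        new_theta = update_step(theta, enhanced, learning_rate)  # steps (6)-(8)
        change = max(abs(a - b) for a, b in zip(new_theta, theta))
        theta = new_theta
        if change < tol:                                        # assumed convergence test
            break
    return theta                                                # step (10)


# Toy stand-ins, used only to show that the skeleton runs end to end.
def toy_teacher(demos, fb):
    return sum(fb) / len(fb)


def toy_enhance(demos, teacher):
    return [list(d) for d in demos]


def toy_update(theta, demos, lr):
    # Gradient step on a quadratic that pulls theta toward the mean demonstration.
    n = len(demos)
    target = [sum(d[i] for d in demos) / n for i in range(len(theta))]
    return [t - lr * (t - m) for t, m in zip(theta, target)]


theta = run_learning_loop(
    [[1.0, 2.0], [3.0, 4.0]], [1.0, -1.0], [0.0, 0.0], 0.5,
    toy_teacher, toy_enhance, toy_update,
)
```

Passing the three stages as functions keeps the loop independent of the specific equations, so each stage can be swapped for a real implementation without changing the surrounding structure.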