Research Article
Learning from Demonstrations and Human Evaluative Feedbacks: Handling Sparsity and Imperfection Using Inverse Reinforcement Learning Approach
| (1) | Input: , feature , , and number of interaction steps |
| (2) | |
| (3) | |
| (4) | ; ; |
| (5) | while teacher is not satisfied |
| (6) | |
| (7) | Interact with the environment for steps |
| (8) | |
| (9) | execute action according to |
| (10) | if teacher critique for is received |
| (11) | |
| (12) | |
| (13) | |
| (14) | end interaction |
| (15) | |
| (16) | |
| (17) | end while |
| (18) | Output: |
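
The control flow that survives in the listing above (an outer loop that runs until the teacher is satisfied, an inner interaction of a fixed number of steps, and an optional critique per executed action) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the mathematical content of the initialization and update steps is not recoverable from the listing, so it is stood in for by a hypothetical `update_from_critiques` callback. All names in the code (`interactive_loop`, `env_step`, `get_critique`, `teacher_satisfied`, `max_rounds`) are assumptions introduced only for this example.

```python
from typing import Callable, Hashable, List, Optional, Tuple

State = Hashable
Action = Hashable
Policy = Callable[[State], Action]
# One recorded critique: (state, action, teacher's evaluative signal).
Critique = Tuple[State, Action, float]


def interactive_loop(
    policy: Policy,
    env_step: Callable[[State, Action], State],
    initial_state: State,
    get_critique: Callable[[State, Action], Optional[float]],
    update_from_critiques: Callable[[Policy, List[Critique]], Policy],
    teacher_satisfied: Callable[[Policy], bool],
    n_steps: int,
    max_rounds: int = 100,
) -> Policy:
    """Skeleton of the outer/inner loop only; every callback is hypothetical."""
    rounds = 0
    # Outer loop: "while teacher is not satisfied".
    # max_rounds is an added safety bound, not part of the listing.
    while rounds < max_rounds and not teacher_satisfied(policy):
        state = initial_state
        critiques: List[Critique] = []
        # Inner loop: "interact with the environment for [n] steps".
        for _ in range(n_steps):
            action = policy(state)  # "execute action according to" the current policy
            feedback = get_critique(state, action)
            if feedback is not None:  # "if teacher critique ... is received"
                critiques.append((state, action, feedback))
            state = env_step(state, action)
        # Placeholder for the update steps whose content is missing from the listing.
        policy = update_from_critiques(policy, critiques)
        rounds += 1
    return policy
```

Feedback is treated as sparse here: `get_critique` returns `None` whenever the teacher stays silent, so only the critiqued state-action pairs reach the update step. How those pairs should change the reward or the policy is exactly the part of the listing that is missing, which is why it is left to the caller.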