Research Article

A Framework and Algorithm for Human-Robot Collaboration Based on Multimodal Reinforcement Learning

Algorithm 1: MRLC (Multimodal Reinforcement Learning Cooperation).
Input: User_speeches, User_body_gestures, User_hand_gestures, final_task, M(i, subtask), M(subtask, motion)
Initialize: NLP, Sub_classifier(User_speeches, User_body_gestures, User_hand_gestures), replay memory M, episode ← 0, load θ, replace_iter
Output: Motion_robot
While final_task is not finished do:
    s ← Sub_classifier(User_speeches, User_body_gestures, User_hand_gestures)
    With probability ε, select a random intention i
    Otherwise, use equation (1) to calculate i
    subtask ← M(i, subtask)
    Motion ← M(subtask, motion)
    Motion_robot ← Motion − Motion_user
    r ← NLP(feedback_speech)
    // s′ is the next behavior feature of the user after the robot executes Motion_robot
    s′ ← Sub_classifier output after the robot executes Motion_robot
    Calculate reward r_t according to equation (2)
    Store (s, i, r, s′) in M
    batch_memory ← random sample from M
    If s marks the end of the collaboration:
     y ← r
   Else:
    Use equation (3) to calculate y
    Use equation (4) to calculate the loss
    Minimize the loss
    If episode > replace_iter:
     θ⁻ ← θ   // replace the target-network parameters
End
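
To make the control flow of Algorithm 1 concrete, the Python sketch below reproduces the loop with a DQN-style learner. It is a minimal illustration under stated assumptions, not the authors' implementation: the Sub_classifier fusion, the NLP feedback-to-reward step, and the M(i, subtask)/M(subtask, motion) mappings are replaced by hypothetical stubs, and equations (1)-(4) are assumed to be the standard ε-greedy action selection, reward assignment, target y = r + γ·max Q(s′, ·; θ⁻), and squared TD-error loss.

# Minimal sketch of the MRLC loop (assumptions noted above); requires PyTorch and NumPy.
import random
import numpy as np
import torch
import torch.nn as nn

STATE_DIM, N_INTENTIONS = 16, 4               # assumed sizes of the fused behavior feature and intention set
GAMMA, EPSILON, REPLACE_ITER, BATCH = 0.9, 0.1, 50, 32

q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_INTENTIONS))      # Q(s, i; θ)
target_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_INTENTIONS)) # Q(s, i; θ⁻)
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
memory = []                                   # replay memory M

def sub_classifier(speech, body_gesture, hand_gesture):
    # Hypothetical stand-in: fuse the three modalities into one behavior feature s.
    return np.random.rand(STATE_DIM).astype(np.float32)

def nlp_reward(feedback_speech):
    # Hypothetical stand-in for equation (2): a scalar reward from the user's spoken feedback.
    return random.choice([-1.0, 0.0, 1.0])

step, done = 0, False
s = sub_classifier("speech", "body", "hand")
while not done:                                           # while final_task is not finished
    if random.random() < EPSILON:                         # with probability ε pick a random intention
        i = random.randrange(N_INTENTIONS)
    else:                                                 # equation (1), assumed: i = argmax_i Q(s, i; θ)
        i = int(q_net(torch.from_numpy(s)).argmax())
    # subtask ← M(i, subtask); Motion ← M(subtask, motion); Motion_robot ← Motion − Motion_user (robot acts here)
    r = nlp_reward("user feedback")                       # reward from the NLP module
    s_next = sub_classifier("speech", "body", "hand")     # user's behavior feature after the robot's motion
    done = random.random() < 0.05                         # placeholder end-of-collaboration signal
    memory.append((s, i, r, s_next, done))

    batch = random.sample(memory, min(BATCH, len(memory)))     # batch_memory ← random sample from M
    sb, ib, rb, sb2, db = map(np.array, zip(*batch))
    q = q_net(torch.from_numpy(sb.astype(np.float32)))[torch.arange(len(batch)), torch.from_numpy(ib)]
    with torch.no_grad():                                 # equation (3), assumed: y = r + γ·max Q(s′, ·; θ⁻), and y = r at termination
        y = torch.from_numpy(rb.astype(np.float32)) + GAMMA * (1.0 - torch.from_numpy(db.astype(np.float32))) \
            * target_net(torch.from_numpy(sb2.astype(np.float32))).max(dim=1).values
    loss = nn.functional.mse_loss(q, y)                   # equation (4), assumed squared TD error
    optimizer.zero_grad(); loss.backward(); optimizer.step()

    step += 1
    if step % REPLACE_ITER == 0:                          # periodically replace the target-network parameters
        target_net.load_state_dict(q_net.state_dict())    # θ⁻ ← θ
    s = s_next

The periodic copy of θ into θ⁻ at the bottom of the loop plays the role of the episode > replace_iter check in Algorithm 1: the online network is trained every step, while the target network used in equation (3) is only refreshed at the replacement interval.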