Review Article

Emotionally Intelligent Chatbots: A Systematic Literature Review

Table 6: Automatic evaluation metrics.

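| Evaluation type | Metric | # of studies | Studies |
|---|---|---|---|
| Evaluation of generated responses | BLEU | 24 | [17, 39, 47, 49–53, 55, 57–59, 64, 65, 68, 69, 72–76, 79, 83] |
| Evaluation of generated responses | Perplexity | 20 | [17, 36, 48, 49, 51, 52, 54, 55, 60–67, 72, 73, 75, 79] |
| Evaluation of generated responses | Distinct-1 and Distinct-2 | 12 | [36, 50, 53, 56–59, 64, 67, 74, 76, 80] |
| Evaluation of generated responses | ROUGE | 5 | [39, 52, 54, 59, 66] |
| Evaluation of generated responses | METEOR | 4 | [39, 48, 52, 66] |
| Evaluation of emotions | F1 | 5 | [47, 60, 71, 78, 80] |
| Evaluation of emotions | Precision | 9 | [17, 47, 48, 60, 67, 69, 71, 78, 84] |
| Evaluation of emotions | Recall | 8 | [47, 48, 60, 67, 69, 71, 78, 84] |
| Evaluation of emotions | Accuracy | 22 | [8, 17, 48–52, 55, 57, 58, 61, 66, 68, 71–73, 75, 76, 78, 80, 81, 84] |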
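
To make two of the listed metrics concrete, the following minimal Python sketch computes Distinct-n (the ratio of unique n-grams to all n-grams in a set of generated responses) and the precision, recall, and F1 of a single emotion label. The function names, the whitespace tokenization, and the toy data are assumptions made for illustration only; the reviewed studies may tokenize, average, or aggregate these scores differently.

```python
def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams across all responses."""
    ngrams = []
    for response in responses:
        # Assumption: lowercase and split on whitespace.
        tokens = response.lower().split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def precision_recall_f1(gold, predicted, label):
    """Precision, recall, and F1 for one emotion label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data, not drawn from any reviewed study.
responses = ["i am so happy for you", "i am sorry to hear that"]
gold = ["joy", "sadness", "joy", "anger"]
predicted = ["joy", "joy", "joy", "anger"]

print("Distinct-1:", distinct_n(responses, 1))
print("Distinct-2:", distinct_n(responses, 2))
print("joy P/R/F1:", precision_recall_f1(gold, predicted, "joy"))
```

Higher Distinct-n values indicate less repetitive output, while the emotion metrics compare predicted emotion labels against gold labels one class at a time.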