Research Article

Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels

Table 9

Average classification performance of the ASR system for speech and emotion recognition (individual speakers from the speech corpus) across five iterations.

Speaker Statistic Anger Happiness Neutral Sadness Word accuracy Phoneme accuracy

MS1 Average 100.00 66.67 100.00 100.00 78.75 82.82
Std Dev 0.00 0.00 0.00 0.00 2.17 1.71

MS2 Average 81.90 86.67 100.00 100.00 86.59 88.32
Std Dev 11.74 12.64 0.00 0.00 5.21 4.70

MS3 Average 90.00 95.00 100.00 100.00 93.17 93.83
Std Dev 9.13 11.18 0.00 0.00 3.52 3.09

FS1 Average 96.67 100.00 93.33 100.00 77.00 82.34
Std Dev 7.45 0.00 14.91 0.00 6.59 4.41

FS2 Average 76.31 100.00 100.00 100.00 83.63 85.88
Std Dev 10.85 0.00 0.00 0.00 5.61 4.96

FS3 Average 77.26 100.00 100.00 100.00 84.75 86.46
Std Dev 9.96 0.00 0.00 0.00 7.56 5.85

Total (all speakers) Average 87.02 91.39 98.89 100.00 83.98 86.61
Std Dev 12.47 13.75 6.09 0.00 7.30 5.55
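
The "Total (all speakers)" row can be cross-checked against the per-speaker rows. The following is a minimal sketch, not the authors' procedure. It assumes that each per-speaker Std Dev is a sample standard deviation (n - 1 denominator) over the five iterations named in the caption, and that the total Std Dev is taken over all 30 speaker-iteration results. The source states neither assumption. The names `SPEAKERS`, `pooled_stats`, and `total_row` are hypothetical helpers introduced only for this illustration.

```python
import math

N_ITER = 5  # iterations per speaker, as stated in the table caption

COLUMNS = ["Anger", "Happiness", "Neutral", "Sadness",
           "Word accuracy", "Phoneme accuracy"]

# Per-speaker (averages, std devs) copied from Table 9, in COLUMNS order.
SPEAKERS = {
    "MS1": ([100.00, 66.67, 100.00, 100.00, 78.75, 82.82],
            [0.00, 0.00, 0.00, 0.00, 2.17, 1.71]),
    "MS2": ([81.90, 86.67, 100.00, 100.00, 86.59, 88.32],
            [11.74, 12.64, 0.00, 0.00, 5.21, 4.70]),
    "MS3": ([90.00, 95.00, 100.00, 100.00, 93.17, 93.83],
            [9.13, 11.18, 0.00, 0.00, 3.52, 3.09]),
    "FS1": ([96.67, 100.00, 93.33, 100.00, 77.00, 82.34],
            [7.45, 0.00, 14.91, 0.00, 6.59, 4.41]),
    "FS2": ([76.31, 100.00, 100.00, 100.00, 83.63, 85.88],
            [10.85, 0.00, 0.00, 0.00, 5.61, 4.96]),
    "FS3": ([77.26, 100.00, 100.00, 100.00, 84.75, 86.46],
            [9.96, 0.00, 0.00, 0.00, 7.56, 5.85]),
}


def pooled_stats(means, stds, n=N_ITER):
    """Combine equal-sized groups into an overall mean and sample std dev.

    Assumption: each group std dev uses the n - 1 denominator, so the
    within-group sum of squares is (n - 1) * s**2.
    """
    k = len(means)
    grand_mean = sum(means) / k
    within = sum((n - 1) * s ** 2 for s in stds)
    between = sum(n * (m - grand_mean) ** 2 for m in means)
    total_std = math.sqrt((within + between) / (k * n - 1))
    return grand_mean, total_std


def total_row():
    """Return {column: (mean, std)} recomputed from the per-speaker rows."""
    result = {}
    for j, col in enumerate(COLUMNS):
        means = [avg[j] for avg, _ in SPEAKERS.values()]
        stds = [sd[j] for _, sd in SPEAKERS.values()]
        result[col] = pooled_stats(means, stds)
    return result


if __name__ == "__main__":
    for col, (mean, std) in total_row().items():
        print(f"{col:17s} mean={mean:6.2f}  std={std:5.2f}")
```

Under these assumptions the recomputed means and standard deviations agree with the reported total row to within rounding of the two-decimal inputs. That agreement is consistent with a sample standard deviation over all speaker-iteration results, but it does not confirm that this is how the table was produced.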