Research Article

FPT-Former: A Flexible Parallel Transformer of Recognizing Depression by Using Audiovisual Expert-Knowledge-Based Multimodal Measures

Table 6

Comparison between FPT-Former (using only one modality) and the AVEC 2019 baselines.

Model                              RMSE

AVEC 2019 baseline-FACS [19]       7.02
AVEC 2019 baseline-MFCC [19]       7.28
AVEC 2019 baseline-eGeMAPS [19]    7.78
FPT-Former (FACS only)             6.11
FPT-Former (MFCC only)             7.02
FPT-Former (eGeMAPS only)          6.60

Bold font indicates that the RMSE of our model is lower than those of all three baseline models.
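The comparison in Table 6 can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: it copies the RMSE values from the table, computes the gap between each single-modality FPT-Former result and the baseline that uses the same feature set, and tests whether each result is lower than all three baselines. The names `baseline_rmse`, `fpt_former_rmse`, `beats_all_baselines`, and `summarize` are hypothetical, and the choice of a strict "lower than" comparison is an assumption based on the wording of the table note.

```python
# RMSE values copied from Table 6 (lower is better).
baseline_rmse = {"FACS": 7.02, "MFCC": 7.28, "eGeMAPS": 7.78}
fpt_former_rmse = {"FACS": 6.11, "MFCC": 7.02, "eGeMAPS": 6.60}


def beats_all_baselines(rmse, baselines):
    """Return True if rmse is strictly lower than every baseline RMSE.

    Strict inequality is an assumption taken from the table note's wording.
    """
    return all(rmse < b for b in baselines.values())


def summarize(ours, baselines):
    """Build one row per feature set:
    (feature, our RMSE, gap to the same-feature baseline, beats all baselines).
    """
    rows = []
    for feature, rmse in ours.items():
        # Positive gap means our RMSE is lower than the same-feature baseline.
        gap = round(baselines[feature] - rmse, 2)
        rows.append((feature, rmse, gap, beats_all_baselines(rmse, baselines)))
    return rows


summary = summarize(fpt_former_rmse, baseline_rmse)
for feature, rmse, gap, beats in summary:
    print(f"{feature:8s} RMSE={rmse:.2f} gap={gap:.2f} lower_than_all={beats}")
```

Under the strict comparison, the FACS-only and eGeMAPS-only results are lower than all three baselines, while the MFCC-only result (7.02) ties the FACS baseline (7.02) and therefore is not strictly lower, although it still improves on the MFCC baseline by 0.26.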