Research Article
Vietnamese Sentiment Analysis under Limited Training Data Based on Deep Neural Networks
Table 3
The accuracy results of various data augmentation techniques for Vietnamese sentiment analysis based on machine learning classifiers.
| Dataset | Classifier | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dataset 1 | LR | 0.850 | 0.864 | 0.848 | 0.865 | 0.863 | 0.856 | 0.858 | 0.858 |
| Dataset 1 | SVM | 0.859 | 0.852 | 0.812 | 0.860 | 0.865 | 0.856 | 0.861 | 0.831 |
| Dataset 1 | OVO | 0.854 | 0.857 | 0.854 | 0.863 | 0.864 | 0.856 | 0.862 | 0.858 |
| Dataset 1 | OVR | 0.854 | 0.857 | 0.854 | 0.863 | 0.864 | 0.856 | 0.862 | 0.858 |
| Dataset 2 | LR | 0.742 | 0.743 | 0.752 | 0.754 | 0.752 | 0.751 | 0.749 | 0.748 |
| Dataset 2 | SVM | 0.721 | 0.741 | 0.711 | 0.754 | 0.736 | 0.746 | 0.720 | 0.732 |
| Dataset 2 | OVO | 0.751 | 0.745 | 0.745 | 0.756 | 0.759 | 0.747 | 0.751 | 0.750 |
| Dataset 2 | OVR | 0.734 | 0.731 | 0.738 | 0.741 | 0.747 | 0.738 | 0.739 | 0.738 |
| Dataset 3 | LR | 0.826 | 0.831 | 0.832 | 0.838 | 0.831 | 0.825 | 0.828 | 0.826 |
| Dataset 3 | SVM | 0.824 | 0.829 | 0.825 | 0.832 | 0.831 | 0.820 | 0.826 | 0.826 |
| Dataset 3 | OVO | 0.823 | 0.829 | 0.827 | 0.831 | 0.829 | 0.819 | 0.825 | 0.826 |
| Dataset 3 | OVR | 0.823 | 0.829 | 0.827 | 0.831 | 0.829 | 0.819 | 0.825 | 0.826 |
| Dataset 4 | LR | 0.816 | 0.830 | 0.829 | 0.829 | 0.820 | 0.814 | 0.822 | 0.821 |
| Dataset 4 | SVM | 0.828 | 0.826 | 0.833 | 0.831 | 0.834 | 0.817 | 0.826 | 0.827 |
| Dataset 4 | OVO | 0.826 | 0.826 | 0.833 | 0.829 | 0.834 | 0.816 | 0.826 | 0.827 |
| Dataset 4 | OVR | 0.826 | 0.826 | 0.833 | 0.829 | 0.834 | 0.816 | 0.826 | 0.827 |
(1) Preprocessing techniques; (2) EDA; (3) sentence shuffling; (4) back translation; (5) syntax-tree transformation; (6) contextual substitution (w2v); (7) contextual substitution ( ā+ā ); (8) masked language model (PhoBERT).