Research Article

Vietnamese Sentiment Analysis under Limited Training Data Based on Deep Neural Networks

Table 2

Accuracy of machine learning classifiers for Vietnamese sentiment analysis under various preprocessing techniques.

Datasets   Classifiers  (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)

Dataset 1  LR           0.822  0.823  0.822  0.823  0.828  0.823  0.843  0.852  0.829
           SVM          0.832  0.833  0.832  0.833  0.830  0.835  0.852  0.856  0.837
           OVO          0.826  0.827  0.826  0.826  0.829  0.828  0.848  0.852  0.833
           OVR          0.826  0.827  0.826  0.826  0.829  0.828  0.848  0.852  0.833

Dataset 2  LR           0.707  0.707  0.707  0.707  0.703  0.712  0.740  0.742  0.707
           SVM          0.673  0.672  0.673  0.673  0.664  0.686  0.707  0.698  0.673
           OVO          0.715  0.715  0.715  0.715  0.705  0.718  0.748  0.745  0.714
           OVR          0.697  0.699  0.697  0.698  0.696  0.708  0.735  0.738  0.698

Dataset 3  LR           0.798  0.797  0.798  0.798  0.790  0.806  0.826  0.802  0.800
           SVM          0.799  0.797  0.799  0.800  0.789  0.806  0.822  0.811  0.801
           OVO          0.800  0.799  0.800  0.801  0.792  0.807  0.822  0.813  0.803
           OVR          0.800  0.799  0.800  0.801  0.792  0.807  0.822  0.813  0.803

Dataset 4  LR           0.805  0.806  0.805  0.805  0.798  0.808  0.828  0.815  0.811
           SVM          0.808  0.810  0.808  0.810  0.804  0.813  0.830  0.822  0.812
           OVO          0.806  0.807  0.806  0.806  0.804  0.812  0.830  0.824  0.813
           OVR          0.806  0.807  0.806  0.806  0.804  0.812  0.830  0.824  0.813

(1) Without preprocessing (the baseline result); (2) number removal; (3) punctuation removal; (4) elongated character removal; (5) POS tagging selection; (6) intensifier handling; (7) negation handling (replacing the lexicons); (8) negation handling (using pretrained models); (9) emoji icon substitution.
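
As an illustration only, techniques (2) to (4) from the legend could be sketched in Python as follows. This is not the authors' implementation: the function names, the regular expressions, the restriction to ASCII punctuation, and the rule of collapsing any word character repeated three or more times to a single occurrence are all assumptions made for this example.

```python
import re
import string


def remove_numbers(text: str) -> str:
    """Technique (2), assumed form: delete every run of digits."""
    return re.sub(r"\d+", "", text)


def remove_punctuation(text: str) -> str:
    """Technique (3), assumed form: delete ASCII punctuation characters."""
    return text.translate(str.maketrans("", "", string.punctuation))


def squeeze_elongated(text: str) -> str:
    """Technique (4), assumed form: collapse a word character that is
    repeated three or more times in a row to a single occurrence."""
    return re.sub(r"(\w)\1{2,}", r"\1", text)


def preprocess(text: str) -> str:
    """Apply the three hypothetical steps, then normalize whitespace."""
    text = remove_numbers(text)
    text = remove_punctuation(text)
    text = squeeze_elongated(text)
    return " ".join(text.split())


if __name__ == "__main__":
    print(preprocess("Món này ngonnnn lắm!!! 10 điểm"))
```

Under these assumptions the elongated word "ngonnnn" is reduced to "ngon". The threshold of three repetitions is a design choice of this sketch; a real pipeline might use a different threshold or a dictionary lookup, and might need Unicode normalization before matching accented Vietnamese characters.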