Research Article

A Topic Recognition Method of News Text Based on Word Embedding Enhancement

Table 4

Results on the 20 classes of 20NewsGroup (7532 texts) using the SVM and LR classifiers.

Model      Average 5-fold micro-F1 score of different dimensions
           100                 200                 300                 400                 500
           Dim   SVM    LR     Dim   SVM    LR     Dim   SVM    LR     Dim   SVM    LR     Dim   SVM    LR

TF-IDF     100   0.373  0.372  200   0.48   0.475  300   0.586  0.580  400   0.637  0.631  500   0.671  0.663
LDA        100   0.686  0.682  200   0.692  0.689  300   0.712  0.711  400   0.715  0.714  500   0.721  0.723
Glove      100   0.724  0.710  200   0.767  0.754  300   0.784  0.771  400   0.794  0.780  500   0.799  0.788
SGL        100   0.745  0.732  200   0.779  0.767  300   0.792  0.780  400   0.807  0.794  500   0.813  0.802
CGL        200   0.782  0.772  400   0.804  0.792  600   0.812  0.796  800   0.822  0.806  1000  0.826  0.813
Word2vec   100   0.740  0.733  200   0.765  0.756  300   0.766  0.755  400   0.769  0.758  500   0.765  0.755
SWL        100   0.747  0.743  200   0.777  0.769  300   0.782  0.771  400   0.787  0.777  500   0.787  0.777
CWL        200   0.780  0.776  400   0.795  0.787  600   0.793  0.782  800   0.793  0.783  1000  0.796  0.785

Bold values indicate the optimal results.
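
The table reports an average 5-fold micro-F1 score but does not spell out the computation. The following is a minimal sketch of how such a score could be obtained, assuming single-label classification and the standard pooled definition of micro-F1 (true positives, false positives, and false negatives summed over all classes). The function names `micro_f1` and `average_micro_f1` and the toy labels are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    tp_sum = sum(tp.values())
    denom = 2 * tp_sum + sum(fp.values()) + sum(fn.values())
    return 2 * tp_sum / denom if denom else 0.0


def average_micro_f1(folds):
    """Mean micro-F1 over an iterable of (y_true, y_pred) pairs, one per fold."""
    scores = [micro_f1(y_true, y_pred) for y_true, y_pred in folds]
    return sum(scores) / len(scores)


# Toy example with two folds (hypothetical labels, not data from the article).
toy_folds = [
    ([0, 1], [0, 1]),  # both predictions correct
    ([0, 1], [1, 1]),  # one of two predictions correct
]
toy_average = average_micro_f1(toy_folds)
```

In the single-label setting assumed here, micro-F1 coincides with accuracy, since every misclassified text contributes exactly one false positive and one false negative.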