Research Article

A Topic Recognition Method of News Text Based on Word Embedding Enhancement

Table 1

Definitions and descriptions of the symbols used in this paper.

Symbol definition    Description

The number of tokens in a document
The vocabulary including all words in the corpus
The number of topics used when training the LDA model
LDA model obtained by training on the corpus
The dimensionality of the word vectors when training the word embedding model
EM    Word embedding model obtained by training on the corpus
Document-level topic distribution for a document
Word-level topic distribution for a document
Text representation based on the word embedding model and the document-level topic distribution
Text representation based on the word embedding model and the word-level topic distribution
Dimension of the document representation vector
Topic distribution
Symbol denoting the concatenation of vectors
Symbol denoting the summation of vectors
Corpus set consisting of the training corpus and the test corpus
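
To make the roles of the concatenation and summation symbols concrete, the following is a minimal sketch of one way a document representation could be assembled from a word embedding model and a document-level topic distribution. The table does not specify how the two are combined, so this sketch assumes (as an illustration only, not as the paper's method) that word vectors are summed and averaged, and that the topic distribution is then concatenated onto the result. The function name `doc_representation` and the toy embeddings are hypothetical.

```python
from typing import Dict, List, Sequence


def doc_representation(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
    topic_dist: Sequence[float],
) -> List[float]:
    """Average the word vectors of a document, then append its topic distribution.

    Assumption: averaging plus concatenation is only one plausible combination;
    the source table does not state the exact formula.
    """
    # Keep only tokens that the embedding model knows about.
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token of the document is in the embedding vocabulary")

    # Summation of vectors, component by component.
    summed = [sum(column) for column in zip(*vectors)]
    averaged = [x / len(vectors) for x in summed]

    # Concatenation: embedding part followed by the topic distribution.
    return averaged + list(topic_dist)


# Toy data: 2-dimensional embeddings and a 3-topic distribution.
toy_embeddings = {"market": [1.0, 0.0], "stocks": [0.0, 1.0]}
toy_topics = [0.7, 0.2, 0.1]
vector = doc_representation(["market", "stocks", "unknown"], toy_embeddings, toy_topics)
```

Under this assumption, the dimension of the document representation vector is the word vector dimension plus the number of topics (here 2 + 3 = 5).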