Research Article
A Topic Recognition Method of News Text Based on Word Embedding Enhancement
Table 1
The definition and description of symbols involved in this paper.
| Symbol | Description |
| --- | --- |
| | The number of tokens in a document |
| | The vocabulary, including all words of the corpus |
| | The number of topics used when training the LDA model |
| | LDA model obtained by training on the corpus |
| | The dimension of the word vectors used when training the word embedding model |
| EM | Word embedding model obtained by training on the corpus |
| | Document-level topic distribution for a document |
| | Word-level topic distribution for a document |
| | Text representation based on the word embedding model and the document-level topic distribution |
| | Text representation based on the word embedding model and the word-level topic distribution |
| | Dimension of the document representation vector |
| | Topic distribution |
| | Symbol denoting the concatenation of vectors |
| | Symbol denoting the summation of vectors |
| | Corpus set consisting of the training corpus and the test corpus |
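To make the concatenation and summation operations in Table 1 concrete, the following is a minimal sketch of how a document representation might be assembled from word vectors and a document-level topic distribution. It is an illustration under stated assumptions, not the method defined in this paper: the function name `doc_level_representation`, the choice to sum the token vectors before concatenating, and the toy input values are all hypothetical.

```python
import numpy as np


def doc_level_representation(token_vectors, topic_dist):
    """Illustrative only: sum the word vectors of a document's tokens,
    then concatenate the result with the document-level topic distribution.

    Assumption (not taken from the paper): the text representation is the
    summed word vectors followed by the topic distribution, so its length is
    embedding_dim + n_topics.
    """
    token_vectors = np.asarray(token_vectors, dtype=float)  # (n_tokens, embedding_dim)
    topic_dist = np.asarray(topic_dist, dtype=float)        # (n_topics,)

    summed = token_vectors.sum(axis=0)           # summation of vectors
    return np.concatenate([summed, topic_dist])  # concatenation of vectors


# Toy example: two tokens with 2-dimensional embeddings and two topics.
rep = doc_level_representation([[1.0, 2.0], [3.0, 4.0]], [0.25, 0.75])
print(rep.shape)
```

Under this assumption, the dimension of the document representation vector is the word vector dimension plus the number of topics.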