Research Article

Rapid Text Retrieval and Analysis Supporting Latent Dirichlet Allocation Based on Probabilistic Models

Table 3: Analysis of data according to English grammar based on LDA.

| Word statistics | Word count | Cumulative | Percentage of cumulative (%) |
|---|---|---|---|
| Syllables | 2045 | 2045 | 51.58 |
| Sentences | 126 | 2171 | 3.18 |
| Unique words | 371 | 2542 | 9.36 |
| Average word length (char) | 4.6 | 2546.6 | 0.12 |
| Average sentence length (word) | 10.2 | 2556.8 | 0.26 |
| Monosyllabic words (1 syllable) | 749 | 3305.8 | 18.83 |
| Polysyllabic words (≥3 syllables) | 179 | 3484.8 | 4.52 |
| Syllables per word | 1.6 | 3486.4 | 0.04 |
| Paragraph | 79 | 3565.4 | 1.99 |
| Difficult words | 399 | 3964.4 | 10.07 |
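
The arithmetic behind the Cumulative and percentage columns of Table 3 can be checked with a short script. This is a minimal sketch, not the authors' procedure: it assumes that Cumulative is a running sum of the Word count column in row order and that each percentage is the row's value divided by the final cumulative total (3964.4). The function name `cumulative_table` is hypothetical.

```python
from itertools import accumulate

# Values copied from the "Word count" column of Table 3, in row order.
counts = {
    "Syllables": 2045,
    "Sentences": 126,
    "Unique words": 371,
    "Average word length (char)": 4.6,
    "Average sentence length (word)": 10.2,
    "Monosyllabic words (1 syllable)": 749,
    "Polysyllabic words (≥3 syllables)": 179,
    "Syllables per word": 1.6,
    "Paragraph": 79,
    "Difficult words": 399,
}


def cumulative_table(counts):
    """Return (label, value, running_total, percent_of_total) rows.

    Assumption: the percentage is value / final running total * 100.
    """
    values = list(counts.values())
    running = list(accumulate(values))  # running sum down the column
    total = running[-1]                 # final cumulative total
    return [
        (label, value, round(cum, 1), round(100 * value / total, 2))
        for (label, value), cum in zip(counts.items(), running)
    ]


rows = cumulative_table(counts)
for label, value, cum, pct in rows:
    print(f"{label:36s} {value:>8} {cum:>8} {pct:>6.2f}")
```

Under these assumptions the script reproduces the Cumulative column and eight of the ten percentages. For the monosyllabic and difficult-word rows the recomputed shares come out at about 18.89 and 10.06, against the printed 18.83 and 10.07; the source does not say how those two values were obtained.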