Research Article

A Quantitative Study on Dream of the Red Chamber: Word-Length Distribution and Authorship Attribution

Table 1

Fitting of the extended logarithmic distribution to the word-length distribution in six sample texts.

X[i]    Text 20              Text 40              Text 60
        F[i]    NP[i]        F[i]    NP[i]        F[i]    NP[i]
1       1104    1104.00      1084    1084.00      1121    1121.00
2        771     775.60       791     797.05       750     755.69
3        109      99.33       117      98.82       113     100.81
4         16      21.07         8      20.13        16      22.51
θ       0.2561               0.2480               0.2668
α       0.4480               0.4580               0.4395
X²      2.1888               10.7041              3.3976
DF      1                    1                    1
C       0.0011               0.0054               0.0017
R²      0.9999               0.9994               0.9998

X[i]    Text 80              Text 100             Text 120
        F[i]    NP[i]        F[i]    NP[i]        F[i]    NP[i]
1       1027    1027.00      1022    1022.00       962     962.00
2        863     855.51       874     869.27       913     912.72
3         87      98.97        85      92.92       105     105.54
4         22      15.27        19      15.81        20      19.74
5          0       2.65
6          1       0.61
θ       0.2314               0.2138               0.2313
α       0.4865               0.4890               0.5190
X²      6.0484               1.3447               0.0062
DF      2                    1                    1
C       0.0030               0.0007               0.0000
R²      0.9998               0.9999               1.0000

Note. X[i] is the observed word-length class; F[i] is the observed frequency; NP[i] is the frequency calculated from the extended logarithmic (θ, α) distribution model.
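The values in Table 1 can be checked with a short calculation. The sketch below is illustrative only and rests on three assumptions that the table itself does not state: (1) the extended logarithmic distribution has the form P(1) = 1 - α and P(x) = α·θ^(x-1) / ((x-1)·(-ln(1-θ))) for x ≥ 2; (2) the last observed class absorbs the remaining tail probability; (3) C is the discrepancy coefficient X²/N, with N the number of words in the sample. Under these assumptions the Text 20 column is reproduced to within rounding of the reported parameters. The function names are hypothetical.

```python
import math


def expected_frequencies(theta, alpha, n, classes):
    """Expected counts NP[1..classes] under the assumed model form."""
    norm = -math.log(1.0 - theta)
    probs = [1.0 - alpha]  # assumed: P(1) = 1 - alpha
    for x in range(2, classes):
        # assumed: P(x) = alpha * theta^(x-1) / ((x-1) * -ln(1-theta))
        probs.append(alpha * theta ** (x - 1) / ((x - 1) * norm))
    # assumed: the last class pools all remaining probability mass
    probs.append(1.0 - sum(probs))
    return [n * p for p in probs]


def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (F - NP)^2 / NP."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


# Text 20 column of Table 1
f_20 = [1104, 771, 109, 16]
np_20_table = [1104.00, 775.60, 99.33, 21.07]
n = sum(f_20)

np_20_model = expected_frequencies(0.2561, 0.4480, n, len(f_20))
x2_20 = chi_square(f_20, np_20_table)
c_20 = x2_20 / n  # assumed definition of C

print([round(v, 2) for v in np_20_model])
print(round(x2_20, 4), round(c_20, 4))
```

Because θ and α are reported to four decimals only, the recomputed NP[i] and X² can differ from the printed values in the last digit or two. The same functions applied to Text 80 with six classes give small expected counts for classes 5 and 6, in line with the 2.65 and 0.61 shown in the table.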