Research Article

Stock Price Prediction Based on Natural Language Processing

Table 7

Predictive variables after correlation coefficient screening.

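| Data type | Words | Lag variable | Corr. coef. |
|---|---|---|---|
| Seed keywords | CSI 300 | 1 | 0.9979 |
| Seed keywords | Inflation rate | 1 | 0.6903 |
| Seed keywords | Chinese news | 1 | -0.6836 |
| Seed keywords | Policy | 10 | 0.6456 |
| Seed keywords | Dark horse | 10 | 0.6238 |
| Seed keywords | Stock quotes | 1 | 0.6130 |
| Generated keywords | CSI 300 | 1 | 0.9979 |
| Generated keywords | Compound interest | 1 | 0.7296 |
| Generated keywords | Hot money | 1 | 0.7096 |
| Generated keywords | Dividend | 1 | 0.6703 |
| Generated keywords | Profit | 1 | 0.6513 |
| Generated keywords | Annual interest | 2 | 0.6218 |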
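
The screening summarized in Table 7 can be illustrated with a minimal sketch. It assumes that each keyword series is compared with the target series at several lags using the Pearson correlation coefficient, and that a keyword is kept at its strongest lag when the absolute correlation clears a cutoff. The function names, the lag range of 1 to 10, the cutoff of 0.6, and the synthetic series are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def lagged_corr(feature, target, lag):
    """Correlation between feature[t - lag] and target[t]."""
    if len(feature) != len(target):
        raise ValueError("series must have equal length")
    if not 0 <= lag < len(feature) - 1:
        raise ValueError("lag out of range")
    return pearson(feature[: len(feature) - lag], target[lag:])


def screen(features, target, lags, threshold):
    """Keep each feature at its strongest lag if |r| >= threshold."""
    kept = {}
    for name, series in features.items():
        best_lag, best_r = max(
            ((lag, lagged_corr(series, target, lag)) for lag in lags),
            key=lambda pair: abs(pair[1]),
        )
        if abs(best_r) >= threshold:
            kept[name] = (best_lag, best_r)
    return kept


# Synthetic data (hypothetical): "leading" anticipates the target by 2 steps,
# while "noise" is unrelated to it.
rng = random.Random(0)
signal = [rng.gauss(0, 1) for _ in range(202)]
target = signal[:200]
features = {
    "leading": signal[2:],
    "noise": [rng.gauss(0, 1) for _ in range(200)],
}

# Hypothetical settings: lags 1..10 and a cutoff of 0.6.
kept = screen(features, target, lags=range(1, 11), threshold=0.6)
print(kept)
```

Selecting by absolute value keeps strongly negative predictors, such as the Chinese news entry in Table 7, alongside positive ones.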