Research Article

Data Mining Technology Application in False Text Information Recognition

Table 3

Feature sequence after feature selection.

Order  Feature name (feature number)

1      Number of Chinese characters (1)
2      Total number of characters (2)
3      Different words (6)
4      Number of words (5)
5      Non-Chinese characters (4)
6      Auxiliary word (U) (73)
7      String punctuation (W) (76)
8      Numeral (m) (68)
9      “.” (11)
10     Adverb (d) (63)
11     Noun (n) (60)
12     Average sentence length (9)
13     Verb (V) (61)
14     “,” (10)
15     Temporal words (nt) (65)
16     Of (30)
17     Adjective (a) (62)
18     Total number of sentences (80)
19     Voluntary verb (vu) (78)
20     Past tense marker (33)
21     “;” (15)
22     Hapax dislegomena (8)
23     Preposition (p) (71)
24     Hapax legomena (7)
25     “/” (16)
26     Patient (84)
27     Pronoun (r) (66)
28     Quantifier (q) (70)
29     Conjunction (C) (72)
30     As soon as (43)
31     All (42)
32     Total number of numeric characters (3)
33     Descriptive word (vl) (64)
34     Get (31)
35     Treatment (92)
36     Effect (91)
37     Also (40)
38     Improve (90)
39     “:” (14)
40     Live (34)
41     Than (24)
42     Make (27)
43     Side effect (83)
44     “ ” ” (17)
45     Features (85)
46     Safety (88)
47     From (22)
48     Interjection (e) (74)
49     Four-word phrases (i) (79)
50     Be (19)
51     Relapse (94)
52     Period (87)
53     Like (26)
54     Effective (95)
55     “!” (13)
56     Locative words (nS) (67)
57     Symptom (82)
58     Prefix (nh) (75)
59     Auxiliary (used after an adverbial) (32)
60     Then (52)
61     Health (89)
62     Significant (93)
63     Rare word (x) (77)
64     Treatment (81)
65     After (44)
66     “?” (12)
67     Later (50)
68     Include (25)
69     As (23)
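
Several of the highest-ranked features in Table 3 are simple counts over characters and words. The following Python sketch illustrates how such counts could be computed; it is not the procedure used in the article. The function name `lexical_features`, the dictionary keys, the decision to ignore whitespace, and the use of the CJK Unified Ideographs range U+4E00 to U+9FFF to detect Chinese characters are all assumptions made for illustration, and word segmentation is assumed to have been done beforehand by some external tool.

```python
from collections import Counter


def lexical_features(text, tokens):
    """Compute a few count-based features named in Table 3.

    text   -- the raw string
    tokens -- list of words from any segmenter (hypothetical input;
              segmentation itself is not shown here)
    """
    # Assumption: whitespace is not counted as a character.
    chars = [c for c in text if not c.isspace()]
    # Assumption: "Chinese character" means CJK Unified Ideographs.
    chinese = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    freq = Counter(tokens)
    return {
        "chinese_chars": chinese,
        "total_chars": len(chars),
        "numeric_chars": sum(1 for c in chars if c.isdigit()),
        "non_chinese_chars": len(chars) - chinese,
        "words": len(tokens),
        "different_words": len(freq),
        # Hapax legomena: words that occur exactly once.
        "hapax_legomena": sum(1 for n in freq.values() if n == 1),
        # Hapax dislegomena: words that occur exactly twice.
        "hapax_dislegomena": sum(1 for n in freq.values() if n == 2),
    }
```

The part-of-speech and keyword features in the table would additionally require a tagger and a keyword list, which this sketch does not cover.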