Research Article
Data Mining Technology Application in False Text Information Recognition
Table 3
Feature sequence after feature selection.
| Order | Feature name (feature number) |
| --- | --- |
| 1 | Number of Chinese characters (1) |
| 2 | Total number of characters (2) |
| 3 | Different words (6) |
| 4 | Number of words (5) |
| 5 | Non-Chinese characters (4) |
| 6 | Auxiliary word (U) (73) |
| 7 | String punctuation (W) (76) |
| 8 | Numeral (m) (68) |
| 9 | “.” (11) |
| 10 | Adverb (d) (63) |
| 11 | Noun (n) (60) |
| 12 | Average sentence length (9) |
| 13 | Verb (V) (61) |
| 14 | “,” (10) |
| 15 | Temporal words (nt) (65) |
| 16 | Of (30) |
| 17 | Adjective (a) (62) |
| 18 | Total number of sentences (80) |
| 19 | Voluntary verb (vu) (78) |
| 20 | Past tense marker (33) |
| 21 | “;” (15) |
| 22 | Hapax dislegomena (8) |
| 23 | Preposition (p) (71) |
| 24 | Hapax legomena (7) |
| 25 | “/” (16) |
| 26 | Patient (84) |
| 27 | Pronoun (r) (66) |
| 28 | Quantifier (q) (70) |
| 29 | Conjunction (C) (72) |
| 30 | As soon as (43) |
| 31 | All (42) |
| 32 | Total number of numeric characters (3) |
| 33 | Descriptive word (vl) (64) |
| 34 | Get (31) |
| 35 | Treatment (92) |
| 36 | Effect (91) |
| 37 | Also (40) |
| 38 | Improve (90) |
| 39 | “:” (14) |
| 40 | Live (34) |
| 41 | Than (24) |
| 42 | Make (27) |
| 43 | Side effect (83) |
| 44 | “ ” ” (17) |
| 45 | Features (85) |
| 46 | Safety (88) |
| 47 | From (22) |
| 48 | Interjection (e) (74) |
| 49 | Four-word phrases (i) (79) |
| 50 | Be (19) |
| 51 | Relapse (94) |
| 52 | Period (87) |
| 53 | Like (26) |
| 54 | Effective (95) |
| 55 | “!” (13) |
| 56 | Locative words (nS) (67) |
| 57 | Symptom (82) |
| 58 | Prefix (nh) (75) |
| 59 | Auxiliary (used after an adverbial) (32) |
| 60 | Then (52) |
| 61 | Health (89) |
| 62 | Significant (93) |
| 63 | Rare word (x) (77) |
| 64 | Treatment (81) |
| 65 | After (44) |
| 66 | “?” (12) |
| 67 | Later (50) |
| 68 | Include (25) |
| 69 | As (23) |
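
Several of the highest-ranked features in Table 3 are simple vocabulary counts: number of words (5), different words (6), hapax legomena (7), and hapax dislegomena (8). As a minimal sketch of how such counts could be derived, the Python snippet below assumes the text has already been segmented into a list of word tokens. The function name `vocabulary_features` and the dictionary keys are hypothetical and chosen for illustration only; the segmentation tool and exact counting rules used in the article are not shown here, so this should not be read as the authors' implementation.

```python
from collections import Counter


def vocabulary_features(tokens):
    """Compute four vocabulary-richness counts from a list of word tokens.

    Assumption: `tokens` is an already-segmented list of words. Chinese
    text would first need a word segmenter, which is not shown here.
    """
    # How many times each distinct word occurs.
    word_counts = Counter(tokens)
    # How many distinct words occur exactly k times, for each k.
    frequency_of_counts = Counter(word_counts.values())
    return {
        "number_of_words": len(tokens),               # feature 5
        "different_words": len(word_counts),          # feature 6
        "hapax_legomena": frequency_of_counts[1],     # feature 7: words seen once
        "hapax_dislegomena": frequency_of_counts[2],  # feature 8: words seen twice
    }


if __name__ == "__main__":
    sample = ["a", "b", "a", "c", "d", "d", "e"]
    print(vocabulary_features(sample))
```

For the sample list, "a" and "d" each occur twice and "b", "c", and "e" each occur once, so the sketch reports 7 words, 5 different words, 3 hapax legomena, and 2 hapax dislegomena.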