Research Article

Efficient E-Mail Spam Detection Strategy Using Genetic Decision Tree Processing with NLP Features

Table 1: Description of dataset attributes.

Attributes  Type                        Description
1–48        char_freq_CHAR              The number of characters in an e-mail that are the same as CHAR.
49–54       capital_run_length_average  The average length of consecutive capital letter sequences
55          capital_run_length_longest  Longest consecutive capital letter sequence length
56          capital_run_length_longest  Longest consecutive capital letter sequence length
57          capital_run_length_total    Overall capital letters in e-mail
58          Class attribute             Indicating if an e-mail is classified as spam with class label (1) or not spam with class label (0)
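
To make the capital-run and character attributes in Table 1 concrete, the following is a minimal sketch of how such features might be computed from raw e-mail text. It is not the authors' extraction code: the function names `capital_run_features` and `char_count` are hypothetical, and the sketch assumes that a "capital run" is a maximal sequence of consecutive ASCII uppercase letters and that the character attribute is a raw occurrence count, as the table's wording suggests.

```python
import re


def capital_run_features(text):
    """Return (average, longest, total) lengths of capital-letter runs.

    Assumption: a run is a maximal sequence of consecutive ASCII
    uppercase letters (A-Z); other scripts are ignored in this sketch.
    """
    runs = [len(run) for run in re.findall(r"[A-Z]+", text)]
    if not runs:
        # No capital letters at all: every feature is zero.
        return 0.0, 0, 0
    total = sum(runs)
    return total / len(runs), max(runs), total


def char_count(text, char):
    """Count how many characters in the text are the same as `char`."""
    return text.count(char)


example = "FREE offer!!! Click NOW to WIN $$$"
print(capital_run_features(example))  # (2.75, 4, 11)
print(char_count(example, "!"))       # 3
```

In the example string the capital runs are "FREE", "C", "NOW", and "WIN", giving a total of 11 capital letters across 4 runs, a longest run of 4, and an average run length of 2.75.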