Input: Text documents (T1, T2, T3, …, Tn) under different categories (C1, C2, C3, …, Cn)
Output: Classified textual features
Step 1: Gather the text documents T1, T2, …, Tn.
Step 2: Collect the categories of the text documents, C1, C2, C3, …, Cn.
Step 3: Start a loop over each collected category of text documents.
Step 4: Start an inner loop over the text documents 1 to n.
Step 5: Split each text document into its individual features.
Step 6: Eliminate the stop words.
Step 7: Assign a frequency to the text document features based on the category.
Step 8: Apply stemming based on the frequencies generated in Step 7.
Step 9: Add the result to the data store and exit the loops started in Steps 3 and 4.
Pseudocode:
for each text category Cj in [C1 .. Cn]
    for each text document Tj in [T1 .. Tn] under Cj
        Split the text features of Tj and assign them to Fj
        Eliminate the stop words from Fj and assign the result to SFj
        Assign Fj + SFj to Tj
        Generate the frequency for each Tj based on its category Cj
        Perform stemming based on SFj and Cj
        Add the resultant features to the data store (DS)
    end for
end for
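The pseudocode above (Steps 1-9) can be realized as a short Python routine. The sketch below is illustrative only: the whitespace tokenizer, the small stop-word set, and NLTK's PorterStemmer are assumptions standing in for whatever splitter, stop-word list, and stemmer are actually used.

from collections import Counter, defaultdict
from nltk.stem import PorterStemmer   # assumed dependency; any stemmer would do

STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}   # illustrative subset

def preprocess(documents_by_category):
    # documents_by_category: dict mapping a category Cj to its list of raw documents Tj
    stemmer = PorterStemmer()
    data_store = defaultdict(list)                                 # DS in the pseudocode
    for category, documents in documents_by_category.items():     # loop over C1 .. Cn
        for text in documents:                                     # loop over T1 .. Tn
            features = text.lower().split()                        # Fj: split textual features
            filtered = [w for w in features if w not in STOP_WORDS]   # SFj: stop words removed
            frequencies = Counter(filtered)                        # per-document term frequency
            stemmed = Counter()
            for word, count in frequencies.items():                # stemming driven by the frequencies
                stemmed[stemmer.stem(word)] += count
            data_store[category].append(stemmed)                   # add the resultant features to DS
    return data_store

For example, preprocess({"sport": ["the team won the final match"]}) returns, per category, a list of stemmed term-frequency counters that the later steps can draw on.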
Step 10: Initialize the textual population randomly.
Step 11: Start a while loop that iterates up to the maximum number of features present in the given text document.
Step 12: Raise the iteration level by 1 on each pass until the loop ends.
Step 13: Estimate the fitness function for every term frequency.
Step 14: Select the fittest words in each iteration of the loop based on their fitness values.
Step 15: Perform the crossover operation.
Step 16: Raise the text-population level with the offspring of every crossover generated in Step 15.
Step 17: Apply the decision tree procedure with the optimum value of the confidence threshold (Steps 10-17 are sketched below).
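Steps 10-16 describe a genetic-algorithm style search over the textual features. The sketch below is one possible reading, with several assumptions the listing does not fix: candidates are binary masks over the extracted terms, fitness is the total frequency of the selected terms, selection keeps the two fittest masks, and crossover is single-point.

import random

def genetic_feature_selection(term_freq, population_size=10, seed=0):
    # term_freq: dict mapping a term to its frequency, e.g. from the preprocessing stage
    rng = random.Random(seed)
    terms = sorted(term_freq)
    n = len(terms)
    if n < 2:
        return terms[:]                                    # nothing to search over

    def fitness(mask):
        # Assumed fitness (Step 13): total frequency of the terms the mask selects.
        return sum(term_freq[t] for t, bit in zip(terms, mask) if bit)

    # Step 10: initialize the textual population with random feature masks.
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(population_size)]

    iteration = 0
    while iteration < n:                                   # Step 11: up to the number of features
        iteration += 1                                     # Step 12: raise the iteration level by 1
        population.sort(key=fitness, reverse=True)         # Step 13: evaluate fitness
        parent_a, parent_b = population[0], population[1]  # Step 14: select the fittest masks
        cut = rng.randint(1, n - 1)
        child = parent_a[:cut] + parent_b[cut:]            # Step 15: single-point crossover
        population.append(child)                           # Step 16: grow the population
    best = max(population, key=fitness)
    return [t for t, bit in zip(terms, best) if bit]

Step 17 can then be read as training a decision tree on vectors built from the selected terms and keeping only predictions whose class probability clears a confidence threshold. scikit-learn's DecisionTreeClassifier and the 0.7 threshold below are assumptions; the listing names neither a library nor a value.

from sklearn.tree import DecisionTreeClassifier            # assumed dependency

def classify_with_threshold(train_vectors, train_labels, test_vectors, threshold=0.7):
    # Vectors are term-frequency counts over the GA-selected features.
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(train_vectors, train_labels)
    predictions = []
    for row in tree.predict_proba(test_vectors):
        best = row.argmax()
        # Keep the prediction only when the class probability clears the threshold.
        predictions.append(tree.classes_[best] if row[best] >= threshold else None)
    return predictions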