Research Article

Ensemble Machine Learning Model for Classification of Spam Product Reviews

Table 5

Ten best features selected by Information Gain for Yelp Dataset.

S. No.  Feature                           Description

1       rev_rating                        Reviews rating
2       stdev_revApp_rating               Standard deviation of review rating and rating application
3       stdev_revrating_avgrevratingapp   Standard deviation of review rating and average review rating application
4       avg_cosine_similarity_text        Average cosine similarity in review text
5       polarity_text                     Polarity of review text
6       rev_pos_ascend                    Reviews part of speech in ascending order
7       rev_pos_descend                   Reviews part of speech in descending order
8       avg_levenshtein_dist_text         Average Levenshtein distance between reviews text
9       automated_readability_index_text  Automated readability index (ARI) of review body
10      avg_num_letters_per_word          Average number of letters per word in review body
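
To make the last three features in the table concrete, the following is a minimal sketch of how they could be computed from raw review text. It is not the authors' implementation. The function names are hypothetical and only mirror the feature names in the table. The tokenization (alphanumeric runs as words, sentences split on ".", "!" and "?") and the choice to average the Levenshtein distance over all pairs of reviews are assumptions. The ARI formula used is the standard one: 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43.

```python
import re
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def avg_levenshtein_dist(texts):
    """Mean edit distance over all pairs of reviews (assumed pairing scheme)."""
    pairs = list(combinations(texts, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(x, y) for x, y in pairs) / len(pairs)


def _words(text):
    # Assumption: a word is a maximal run of ASCII letters or digits.
    return re.findall(r"[A-Za-z0-9]+", text)


def avg_num_letters_per_word(text):
    """Mean word length in characters, using the tokenizer above."""
    words = _words(text)
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


def automated_readability_index(text):
    """Standard ARI: 4.71 * chars/words + 0.5 * words/sentences - 21.43."""
    words = _words(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43
```

For example, `levenshtein("kitten", "sitting")` returns 3, and `avg_num_letters_per_word("Good food")` returns 4.0. Very short reviews can yield a negative ARI, so a real pipeline might clip or normalize these values before feature selection.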