[Retracted] Computational Intelligence-Based Recurrent Neural Network for Identification of Deceptive Reviews in the E-Commerce Domain
Table 1
Summary of selected textual and behavioral features, with the equations used to compute deceptiveness scores for labeling product reviews in the Amazon dataset.
Each feature entry below lists the feature ID, an explanation, the formula, a review example, and an interpretation.
F1
Explanation: Proportion of strong positive words in the review content.
Formula: F1 = NPW / RL, where NPW denotes the number of strong positive words and RL is the length of the review content (a sketch of the ratio features F1-F3 and F6 follows the table).
Review example: "I love this charger; it has two ports for my phone and iPod. Sleek and convenient to store, and I just love it."
Interpretation: Indicates that the review content is strongly positive, which plays an important role in the spamicity calculation (Jindal et al. [6]).
F2
Explanation: Proportion of strong negative words in the review content.
Formula: F2 = NNW / RL, where NNW denotes the number of strong negative words and RL is the length of the review content.
Review example: "I am disappointed that the 1A did not work with my iPad. That is what I get for buying a cheap adapter."
Interpretation: Specifies whether the review content is strongly negative, which likewise contributes to the spamicity calculation (Jindal et al. [6]).
F3
Explanation: Percentage of numbers used in the review content.
Formula: F3 = N / RL, where N is the number of numeric digits in the review content and RL is the review length.
Review example: "I have 8 chargers. I have more than 4 vehicles, so I keep more than 1 of these in each."
Interpretation: Heavy use of numbers suggests that the review text is overly specialized, so the review is regarded as a nonreview (Jindal et al. [6]). The example is a type-3 fake review.
F4
Explanation: Authenticity of the reviewer's writing.
Formula: Authenticity = PP + IP + excl (differ) - Negemo - Motion, i.e., the sum of the percentages of personal pronouns (PP), impersonal pronouns (IP), and exclusive (differentiation) words (excl) in the review content, minus the percentages of negative-emotion words (Negemo) and motion words (Motion). We took 50% as the threshold for this feature: a review with Authenticity >= 50% is labeled truthful, whereas one with Authenticity < 50% is labeled fake (see the authenticity sketch following the table).
Review example: "Surprisingly, this inexpensive version works just as well and just as reliably as the expensive variety. It has been working for me for months now. No problem. Excellent value."
Interpretation: This feature captures the authenticity of the reviewer's writing and is used to distinguish truthful reviewers (nonspammers) from deceptive reviewers (spammers) (Newman et al. 2003). Its score ranges from 1 to 100. The example review has an authenticity score of 3.8%, so it is labeled deceptive.
F5
Explanation: Analytic thinking of the reviewer.
Formula: Analytic thinking = 30 + articles + prep - PP - IP - auxverb - conj - adverb - negation, where articles, prep, PP, IP, auxverb, conj, adverb, and negation are the percentages of articles, prepositions, personal pronouns, impersonal pronouns, auxiliary verbs, conjunctions, adverbs, and negation words in the review text (see the analytic-thinking sketch following the table).
Review example: "It worked great for the first couple of weeks, then it just stopped completely. So basically a small waste of money." This review has an analytic-thinking value of 68.29% and is tagged as fake.
Interpretation: This feature measures the reviewer's analytic thinking. Spammers write spam reviews in a deliberate, intelligent style, and this feature quantifies that degree of thinking. Its value ranges from 1% to 99% (Pennebaker et al. 2015).
F6
Explanation: Percentage of comparative words used in the review text.
Formula: F6 = CW / RL, where CW denotes the number of comparative words (comparative and superlative adjectives) in the review content and RL is the review length (covered in the ratio-feature sketch following the table).
Review example: "Samsung mobile chargers are of higher quality and durability than Huawei mobile chargers."
Interpretation: This feature counts comparative and superlative adjectives, which signal a comparison between two entities; a reviewer who writes such text is labeled a spammer.
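The four ratio features (F1, F2, F3, and F6) all divide a token count by the review length. A minimal Python sketch of this computation follows; the positive/negative word lists and the suffix-based test for comparative words are illustrative placeholders (the paper's own lexicons and tagger are not reproduced here), and RL is assumed to be the number of word tokens.

```python
import re

# Placeholder lexicons: the paper uses its own strong positive/negative
# word lists, which are not reproduced here.
POSITIVE_WORDS = {"love", "excellent", "great", "amazing", "perfect"}
NEGATIVE_WORDS = {"disappointed", "cheap", "waste", "broken", "awful"}


def tokens(review: str) -> list[str]:
    """Lowercase word/number tokens; RL is taken as len(tokens(review))."""
    return re.findall(r"[a-z0-9]+", review.lower())


def ratio(count: int, review_length: int) -> float:
    return count / review_length if review_length else 0.0


def f1_positive(review: str) -> float:
    words = tokens(review)
    return ratio(sum(w in POSITIVE_WORDS for w in words), len(words))  # NPW / RL


def f2_negative(review: str) -> float:
    words = tokens(review)
    return ratio(sum(w in NEGATIVE_WORDS for w in words), len(words))  # NNW / RL


def f3_numbers(review: str) -> float:
    words = tokens(review)
    return ratio(sum(w.isdigit() for w in words), len(words))  # N / RL


def f6_comparatives(review: str) -> float:
    # Crude stand-in: a POS tagger selecting comparative/superlative
    # adjective tags (JJR/JJS) would be more faithful to the feature.
    words = tokens(review)
    return ratio(sum(w.endswith(("er", "est")) for w in words), len(words))  # CW / RL
```

To express F3 or F6 as a percentage, as the table's wording suggests, the returned ratio would simply be multiplied by 100.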
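For F4, assuming LIWC-style category percentages have already been computed for a review, the authenticity score and the 50% labeling rule reduce to the following sketch. The numeric inputs in the usage lines are hypothetical values chosen only to reproduce the 3.8% score quoted in the table; they are not from the paper.

```python
def authenticity(pp: float, ip: float, excl: float,
                 negemo: float, motion: float) -> float:
    """Authenticity = PP + IP + excl(differ) - Negemo - Motion.

    Each argument is a category percentage (0-100) of the review text.
    """
    return pp + ip + excl - negemo - motion


def label(score: float, threshold: float = 50.0) -> str:
    # Table rule: Authenticity >= 50% -> truthful, otherwise fake.
    return "truthful" if score >= threshold else "fake"


# Hypothetical category percentages (not taken from the paper):
score = authenticity(pp=5.0, ip=2.5, excl=1.3, negemo=2.0, motion=3.0)
print(score, label(score))  # 3.8 fake
```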
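F5 follows the same pattern. The raw index below is the table's formula; note that the LIWC tool of Pennebaker et al. (2015) rescales this raw value onto a 1-99 percentile scale, so the simple clamp used here is only a stand-in for that rescaling.

```python
def analytic_thinking(articles: float, prep: float, pp: float, ip: float,
                      auxverb: float, conj: float, adverb: float,
                      negation: float) -> float:
    """Analytic = 30 + articles + prep - PP - IP
                  - auxverb - conj - adverb - negation."""
    raw = (30 + articles + prep
           - pp - ip - auxverb - conj - adverb - negation)
    # Stand-in for LIWC's percentile rescaling, which maps scores to 1-99.
    return min(max(raw, 1.0), 99.0)
```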