Research Article

COVID-19 Infodemic in Malaysia: Conceptualizing Fake News for Detection

Table 1

Summary of related work on fake news detection.

Journal author | Natural language processing (NLP) | Machine learning algorithms

Amer and Siddiqui [12] | Tokenization; stop word removal; stemming; term frequency-inverse document frequency (TF-IDF); count vectorizer | Random forest; decision tree

Patwa et al. [13] | Term frequency-inverse document frequency (TF-IDF) | Decision tree; logistic regression; gradient boost; support vector machines (SVM)

Madani et al. [14] | Tokenization; stop word removal; URL and punctuation removal; hashtag extraction; words with character removal; stemming; TextBlob library (sentiment analysis) | Logistic regression; decision tree; Naïve Bayes; support vector machines (SVM); random forest; gradient boosting; multilayer perceptron (MLP)

Elhadad et al. [15] | Text parsing; part of speech (POS) tagging; stop word removal; stemming; term frequency-inverse document frequency (TF-IDF) with unigram, bigram, trigram, and N-gram (2:3); word embeddings | 5-fold cross validation; decision tree; kNN; logistic regression; linear support vector machines; multinomial Naïve Bayes; Bernoulli Naïve Bayes; perceptron; neural network; ensemble random forest; extreme gradient boosting (XGBoost)

Felber [16] | Stop word removal; link removal; lemmatization/stemming; reply removal; lowercase transformation; XML entity replacement | Linear support vector machines; logistic regression; multilayer perceptron; Naïve Bayes; random forest

Pathwar and Gill [17] | Stop word removal; stemming; word embedding; term frequency-inverse document frequency (TF-IDF) | Gradient boost; kNN; multinomial Naïve Bayes; random forest; convolutional neural network (CNN); dense neural network (DNN); recurrent neural network (RNN); random multimodel deep learning (CNN, DNN, RNN)

Bondielli and Marcelloni [11] | Sentiment analysis; opinion mining | Support vector machines; decision tree; random forest; logistic regression; recurrent neural networks (RNN); convolutional neural networks (CNN); factor analysis of mixed data (FAMD)

Ahmad et al. [18] | | Voting classifier: (1) logistic regression, random forest, kNN; (2) logistic regression, linear support vector machines, classification and regression tree (CART). Machine learning algorithms: logistic regression, support vector machines (SVM), multilayer perceptron, kNN. Ensemble learners: random forest, bagging ensemble classifier, boosting ensemble classifier, voting ensemble classifier

Joju and Kammath [19] | | Logistic regression; Naïve Bayes; passive aggressive classifiers; random forest; support vector machines
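
Several of the studies in Table 1 combine the same preprocessing steps (lowercasing, link removal, tokenization, stop word removal) with TF-IDF weighting before classification. The sketch below illustrates that general idea in plain Python; it is not the implementation used by any of the cited authors. The stop word list, the example headlines, and the function names `preprocess` and `tfidf` are assumptions made for illustration. Stemming is omitted, and the weighting is the unsmoothed textbook form tf × log(N / df), which differs from the smoothed variants that common libraries apply by default.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from any cited study).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "for"}


def preprocess(text):
    """Lowercase, remove links and punctuation, tokenize, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # link removal
    tokens = re.findall(r"[a-z0-9]+", text)    # tokenization; punctuation is dropped
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical example headlines, invented for illustration only.
docs = [
    "Garlic is a cure for the virus, see http://example.com",
    "The ministry reports new virus cases",
]
vectors = tfidf(docs)
```

In this toy corpus the term "virus" appears in both headlines and therefore receives a weight of zero, while terms unique to one headline, such as "garlic", receive positive weights. The resulting vectors could then be passed to any of the classifiers listed in the table.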