Journal of Nanomaterials

Research Article

Relevant-Based Feature Ranking (RBFR) Method for Text Classification Based on Machine Learning Algorithm

Comparison of recent works related to imbalanced classification.


Reference	Technique	Methodology	Comments

[23]	Extreme gradient boosting	The proposed algorithm extracts time-, frequency-, and spatial-based features. Random forest is used for classification.	Correlation in time-based features can be improved. Embedded feature selection (FS) can be incorporated.
[24]	Orthogonal least squares	The authors improved the speed of selecting the best features using orthogonal least squares. They compared mutual information and other embedded methods.	The multiple correlation coefficient and the canonical correlation coefficient can be improved when feature generation and instance generation methods are used.
[25]	Centroid mutation-based search	A set of features that shows strong convergence to a set of classes is identified. This improves the position of the classification margin and reduces the error.	Noisy features can be identified and removed before finding the strong convergence.
[26]	Balanced pointwise mutual information	A deep learning model is employed for Twitter text classification. Special characters such as emoji are used as features to classify tweets.	Spam detection can be implemented to increase the accuracy.
[27]	Term weighting	Most feature selection methods use only frequency. The authors used category information as an additional metric to select features for classification.	Semantic information can degrade the performance of the classification.
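
To make the idea of scoring a term by its association with a category (rather than by frequency alone) concrete, the following is a minimal sketch of plain pointwise mutual information (PMI) between terms and class labels. It is not the balanced variant used in [26] nor the term-weighting scheme of [27]; the function name pmi_scores, the toy documents, and the labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def pmi_scores(docs, labels):
    """Return {(term, label): PMI in bits} from document-level presence counts.

    Illustrative sketch only: plain PMI, no smoothing, no class balancing.
    Pairs that never co-occur are omitted (their PMI would be -infinity).
    """
    n_docs = len(docs)
    term_count = Counter()          # documents containing the term
    label_count = Counter(labels)   # documents carrying the label
    joint_count = Counter()         # documents with the label that contain the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):    # count presence, not raw frequency
            term_count[term] += 1
            joint_count[(term, label)] += 1

    scores = {}
    for (term, label), joint in joint_count.items():
        p_joint = joint / n_docs
        p_term = term_count[term] / n_docs
        p_label = label_count[label] / n_docs
        scores[(term, label)] = math.log2(p_joint / (p_term * p_label))
    return scores


# Hypothetical toy corpus: tokenized documents and their class labels.
docs = [
    ["free", "prize", "now"],
    ["meeting", "now"],
    ["free", "offer"],
    ["project", "meeting"],
]
labels = ["spam", "ham", "spam", "ham"]

scores = pmi_scores(docs, labels)
for (term, label), value in sorted(scores.items()):
    print(f"{term:8s} {label:5s} {value:+.2f}")
```

In this toy corpus, a term that appears only in one class (such as "free") receives a positive score for that class, while a term spread evenly across classes (such as "now") scores zero, so ranking by the score favors class-specific terms over merely frequent ones.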