Research Article

Detecting Web Spam Based on Novel Features from Web Page Source Code

Table 4

Results of using three different feature sets with the random forest model.

Feature set                           Accuracy   Precision   Recall   F1 score   AUC
Selected existing features            0.911      0.909       0.911    0.909      0.937
Novel features                        0.911      0.910       0.911    0.907      0.917
Selected existing + novel features    0.930      0.929       0.930    0.829      0.957