Research Article
Breast Cancer Identification from Patients’ Tweet Streaming Using Machine Learning Solution on Spark
Table 3
Accuracy of 10-fold cross-validation (CV) and accuracy on the unseen test dataset after correlation.
| Model | Cross-validation accuracy (%) | Test data accuracy (%) | Best parameter values |
| LR | 99.06 | 98.7 | regParam: 0.1, maxIter: 20 |
| DT | 98.6 | 90.3 | impurity: gini, maxDepth: 5, maxBins: 32 |
| SVM | 99.1 | 98.4 | regParam: 0.02, maxIter: 50, kernel type: linear |
| RF | 99.5 | 96.9 | maxDepth: 7, maxBins: 32, numTrees: 20 |
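To make the comparison in Table 3 easier to read, the gap between cross-validation accuracy and test accuracy can be computed for each model. The following is a minimal sketch in plain Python, not part of the original study; the names `TABLE_3` and `accuracy_gap` are hypothetical, and the figures are simply copied from the table.

```python
# Accuracy figures (in %) copied from Table 3:
# model -> (10-fold cross-validation accuracy, accuracy on unseen test data).
TABLE_3 = {
    "LR": (99.06, 98.7),
    "DT": (98.6, 90.3),
    "SVM": (99.1, 98.4),
    "RF": (99.5, 96.9),
}


def accuracy_gap(cv_acc, test_acc):
    """Percentage points lost when moving from 10-fold CV to the unseen test set."""
    # Round to two decimals to hide floating-point noise in the subtraction.
    return round(cv_acc - test_acc, 2)


# Hypothetical helper structure: gap per model, in percentage points.
gaps = {model: accuracy_gap(cv, test) for model, (cv, test) in TABLE_3.items()}

# Print models from the smallest gap to the largest.
for model, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{model}: {gap} percentage points")
```

On these figures, LR shows the smallest drop (0.36 percentage points) and DT the largest (8.3 percentage points), which suggests that the decision tree generalizes least well to the unseen data among the four models.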