Research Article
Machine Learning-Based Prediction Model of Preterm Birth Using Electronic Health Record
Table 3
The performance of models in the test set.
| Dataset | Model | Accuracy | AUC (95% CI) | Sensitivity | Specificity |
| --- | --- | --- | --- | --- | --- |
| Dataset 1 | SVM | 0.720 | 0.791 (0.771–0.811) | 0.710 | 0.731 |
| | RF | 0.777 | 0.861 (0.841–0.871) | 0.720 | 0.840 |
| | NBM | 0.677 | 0.741 (0.721–0.761) | 0.705 | 0.646 |
| | ANN | 0.634 | 0.691 (0.671–0.711) | 0.687 | 0.576 |
| | K-means | 0.611 | 0.681 (0.661–0.701) | 0.794 | 0.412 |
| | Log | 0.610 | 0.701 (0.681–0.721) | 0.378 | 0.861 |
| Dataset 2 | SVM | 0.721 | 0.791 (0.781–0.811) | 0.722 | 0.721 |
| | RF | 0.794 | 0.871 (0.851–0.881) | 0.756 | 0.832 |
| | NBM | 0.682 | 0.771 (0.751–0.791) | 0.785 | 0.581 |
| | ANN | 0.666 | 0.731 (0.711–0.751) | 0.595 | 0.738 |
| | K-means | 0.602 | 0.681 (0.671–0.701) | 0.811 | 0.393 |
| | Log | 0.606 | 0.701 (0.681–0.721) | 0.364 | 0.847 |
| Dataset 3 | SVM | 0.719 | 0.801 (0.781–0.811) | 0.695 | 0.743 |
| | RF | 0.806 | 0.881 (0.871–0.901) | 0.765 | 0.846 |
| | NBM | 0.674 | 0.791 (0.771–0.811) | 0.837 | 0.515 |
| | ANN | 0.733 | 0.801 (0.791–0.821) | 0.741 | 0.726 |
| | K-means | 0.612 | 0.711 (0.691–0.731) | 0.824 | 0.405 |
| | Log | 0.633 | 0.701 (0.681–0.721) | 0.421 | 0.839 |
| Dataset 4 | SVM | 0.719 | 0.791 (0.781–0.811) | 0.678 | 0.763 |
| | RF | 0.807 | 0.881 (0.871–0.891) | 0.743 | 0.875 |
| | NBM | 0.626 | 0.741 (0.721–0.761) | 0.328 | 0.946 |
| | ANN | 0.732 | 0.811 (0.801–0.831) | 0.730 | 0.734 |
| | K-means | 0.626 | 0.721 (0.701–0.741) | 0.801 | 0.436 |
| | Log | 0.611 | 0.701 (0.691–0.721) | 0.361 | 0.880 |
| Dataset 5 | SVM | 0.729 | 0.801 (0.781–0.811) | 0.685 | 0.773 |
| | RF | 0.816 | 0.891 (0.871–0.901) | 0.751 | 0.882 |
| | NBM | 0.622 | 0.741 (0.721–0.761) | 0.315 | 0.937 |
| | ANN | 0.747 | 0.811 (0.801–0.831) | 0.730 | 0.763 |
| | K-means | 0.609 | 0.701 (0.681–0.721) | 0.780 | 0.434 |
| | Log | 0.623 | 0.691 (0.671–0.711) | 0.391 | 0.861 |
NBM: Naive Bayesian Model; SVM: Support Vector Machine; RF: Random Forest; ANN: Artificial Neural Network; Log: Logistic Regression; Dataset 1: 20 weeks of gestation; Dataset 2: 22 weeks of gestation; Dataset 3: 24 weeks of gestation; Dataset 4: 26 weeks of gestation; Dataset 5: 27 weeks of gestation. AUC: area under the curve; CI: confidence interval.
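For readers less familiar with the metrics reported in Table 3, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC are conventionally computed for a binary classifier. It is not the authors' code. The names `ConfusionCounts`, `accuracy`, `sensitivity`, `specificity`, and `auc`, as well as all counts and scores, are hypothetical and chosen only for illustration; they are not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (positive class = preterm birth)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of actual positives that the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of actual negatives that the model correctly rejects."""
    return c.tn / (c.tn + c.fp)


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores, for illustration only.
counts = ConfusionCounts(tp=72, fn=28, tn=84, fp=16)
example_auc = auc(pos_scores=[0.9, 0.8, 0.4], neg_scores=[0.7, 0.3, 0.2])

print(f"accuracy={accuracy(counts):.3f}")
print(f"sensitivity={sensitivity(counts):.3f}")
print(f"specificity={specificity(counts):.3f}")
print(f"auc={example_auc:.3f}")
```

The pairwise AUC above is quadratic in the number of cases and is meant only to make the definition concrete; a rank-based computation would be the usual choice for large test sets. The sketch does not cover the 95% confidence intervals, since the table does not state how they were obtained.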