Research Article

Machine Learning-Based Prediction Model of Preterm Birth Using Electronic Health Record

Table 3

The performance of models in the test set.

Dataset    Model    Accuracy  AUC (95% CI)         Sensitivity  Specificity

Dataset 1  SVM      0.720     0.791 (0.771–0.811)  0.710        0.731
Dataset 1  RF       0.777     0.861 (0.841–0.871)  0.720        0.840
Dataset 1  NBM      0.677     0.741 (0.721–0.761)  0.705        0.646
Dataset 1  ANN      0.634     0.691 (0.671–0.711)  0.687        0.576
Dataset 1  K-means  0.611     0.681 (0.661–0.701)  0.794        0.412
Dataset 1  Log      0.610     0.701 (0.681–0.721)  0.378        0.861

Dataset 2  SVM      0.721     0.791 (0.781–0.811)  0.722        0.721
Dataset 2  RF       0.794     0.871 (0.851–0.881)  0.756        0.832
Dataset 2  NBM      0.682     0.771 (0.751–0.791)  0.785        0.581
Dataset 2  ANN      0.666     0.731 (0.711–0.751)  0.595        0.738
Dataset 2  K-means  0.602     0.681 (0.671–0.701)  0.811        0.393
Dataset 2  Log      0.606     0.701 (0.681–0.721)  0.364        0.847

Dataset 3  SVM      0.719     0.801 (0.781–0.811)  0.695        0.743
Dataset 3  RF       0.806     0.881 (0.871–0.901)  0.765        0.846
Dataset 3  NBM      0.674     0.791 (0.771–0.811)  0.837        0.515
Dataset 3  ANN      0.733     0.801 (0.791–0.821)  0.741        0.726
Dataset 3  K-means  0.612     0.711 (0.691–0.731)  0.824        0.405
Dataset 3  Log      0.633     0.701 (0.681–0.721)  0.421        0.839

Dataset 4  SVM      0.719     0.791 (0.781–0.811)  0.678        0.763
Dataset 4  RF       0.807     0.881 (0.871–0.891)  0.743        0.875
Dataset 4  NBM      0.626     0.741 (0.721–0.761)  0.328        0.946
Dataset 4  ANN      0.732     0.811 (0.801–0.831)  0.730        0.734
Dataset 4  K-means  0.626     0.721 (0.701–0.741)  0.801        0.436
Dataset 4  Log      0.611     0.701 (0.691–0.721)  0.361        0.880

Dataset 5  SVM      0.729     0.801 (0.781–0.811)  0.685        0.773
Dataset 5  RF       0.816     0.891 (0.871–0.901)  0.751        0.882
Dataset 5  NBM      0.622     0.741 (0.721–0.761)  0.315        0.937
Dataset 5  ANN      0.747     0.811 (0.801–0.831)  0.730        0.763
Dataset 5  K-means  0.609     0.701 (0.681–0.721)  0.780        0.434
Dataset 5  Log      0.623     0.691 (0.671–0.711)  0.391        0.861

NBM: Naive Bayesian; SVM: Support Vector Machine; RF: Random Forest Tree; ANN: Artificial Neural Networks; Log: Logistic regression; Dataset 1: 20 weeks gestation; Dataset 2: 22 weeks gestation; Dataset 3: 24 weeks gestation; Dataset 4: 26 weeks gestation; Dataset 5: 27 weeks gestation. AUC: the area under the curve; CI: confidence interval.
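
For readers less familiar with the metrics reported in Table 3, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally derived from the four cells of a binary confusion matrix. It is an illustration only, not the authors' evaluation code: the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study. It assumes preterm birth is treated as the positive class.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: positives correctly predicted positive
    fp: negatives incorrectly predicted positive
    tn: negatives correctly predicted negative
    fn: positives incorrectly predicted negative
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    positives = tp + fn  # all truly positive cases
    negatives = tn + fp  # all truly negative cases

    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # True positive rate: share of positive cases that were detected.
        "sensitivity": tp / positives if positives else math.nan,
        # True negative rate: share of negative cases correctly ruled out.
        "specificity": tn / negatives if negatives else math.nan,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=75, fp=15, tn=85, fn=25)
print(metrics)  # accuracy 0.8, sensitivity 0.75, specificity 0.85
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so a model can show high specificity but low sensitivity (or the reverse) while its accuracy stays moderate. This is why the table lists all three metrics alongside the AUC.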