Research Article
An Efficient and Effective Model to Handle Missing Data in Classification
Table 1
Specifications of real-world datasets.
| Dataset name | Sample size | Variable number | Discrete variable number | Missing proportion (%) | Imbalance (%) |
| --- | --- | --- | --- | --- | --- |
| Breast Cancer Wisconsin [47] | 699 | 10 | 0 | 2.29 | 65.5 |
| Chronic kidney disease | 400 | 24 | 13 | 60.5 | 62.5 |
| Congressional voting records | 435 | 16 | 16 | 46.67 | 61.4 |
| Credit approval | 690 | 15 | 9 | 5.36 | 55.5 |
| Cylinder bands | 540 | 39 | 19 | 48.7 | 57.8 |
| Heart disease (Hungarian) | 294 | 13 | 7 | 99.66 | 63.9 |
| Hepatitis | 155 | 19 | 13 | 48.39 | 79.4 |
| Horse colic | 368 | 23 | 15 | 98.1 | 63 |
| Mammographic mass [48] | 961 | 5 | 2 | 13.63 | 53.7 |
| Ozone level detection | 2536 | 73 | 0 | 27.13 | 97.1 |