Review Article

Accuracy of Deep Learning Algorithms for the Diagnosis of Retinopathy of Prematurity by Fundus Images: A Systematic Review and Meta-Analysis

Table 1

Characteristics of the nine studies included in the systematic review and meta-analysis.

General characteristics, dataset characteristics, and definition and grade of ROP

| Author | Year, data source | Camera | Reference standard | Dataset | Identification and grade |
| --- | --- | --- | --- | --- | --- |
| Brown et al. [24] | 2018, i-ROP | RetCam | RSD, images and clinical diagnosis | 5511 images | 4535 N, 805 pre and 172 plus |
| Wang et al. [25] | 2018, hospital and web | RetCam 3 | ICROP, CRYO-ROP, and ETROP | 3722 cases | 2823 N and 899 ROP; 382 Min and 295 S |
| Hu et al. [26] | 2019, hospital | RetCam 3 | Consistent label | 2668 images | 1484 N and 1184 ROP; 382 Mil and 295 S |
| Tan et al. [27] | 2019, ART-ROP | RetCam | Images and clinical diagnosis | 6974 images | 5336 N and 1638 plus |
| Wang et al. [28] | 2019, hospital | NR | Consistent label | 11000 images | 7559 N and 3441 ROP; 529 Mil and 1204 S |
| Zhang et al. [29] | 2019, hospital | RetCam 2/3 | The same criteria | 19543 images | 11298 N and 8245 ROP |
| Huang et al. [30] | 2020, hospital | RetCam | ICROP + consistent label | 18808 images | 1222 N and 1129 ROP; 1189 Mil and 1174 S |
| Ramachandran et al. [31] | 2021, KIDROP | RetCam 3 | Consistent label | 289 infants | 200 N and 89 plus |
| Wang et al. [32] | 2021, hospital | RetCam 2/3 | Consistent label | 52249 images | 6363 any stage and 42177 N; 885 pre or plus and 17223 N |

DL model characteristics

| Author | Neural network | Algorithm evaluation | Classification |
| --- | --- | --- | --- |
| Brown et al. [24] | CNN: U-Net and Inception V1 | The 5-fold cross-validation | N/pre and plus; Plus/N and pre |
| Wang et al. [25] | DNN: Id-Net and Gr-Net | NR | N/ROP; Min/S |
| Hu et al. [26] | CNN: a pretrained ImageNet (VGG16, Inception V2, and ResNet-50) | Select the best module and image size | N/ROP; Mil/S |
| Tan et al. [27] | CNN: Inception V3 | NR | N/plus |
| Wang et al. [28] | CNN: a pretrained ImageNet (Inception V2, Inception V3, and ResNet-50) | Select the best module | N/ROP; Mil/S |
| Zhang et al. [29] | DNN: AlexNet, VGG16, and GoogLeNet | Select the best module | N/ROP |
| Huang et al. [30] | DNN: VGG16, VGG19, MobileNet, InceptionV3, and DenseNet | Select the best module and then 5-fold cross-validation | N/ROP; Mil/S |
| Ramachandran et al. [31] | CNN: a pretrained ImageNet (Darknet-53 network) | Select the best module | N/plus |
| Wang et al. [32] | CNN: ResNet18, DenseNet121, and EfficientNetB2 | Five independent classifiers validation | Preplus plus/non; Any stage/non |
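Accuracy values

| Author | Negative vs. positive | TD | VD | ACC (VD) | SN (VD) | SP (VD) | AUC (VD) | TED | ACC (TED) | SN (TED) | SP (TED) | AUC (TED) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Brown et al. [24] | N vs. pre and plus | 80% | 20% | NR | NR | NR | 0.94 | 100 (from the same set with TD) | 0.91 | 0.93 | 0.94 | NR |
| Brown et al. [24] | N and pre vs. plus | 80% | 20% | NR | NR | NR | 0.98 | | | 1 | 0.94 | NR |
| Wang et al. [25] | N vs. ROP | 2226 | 298 | NR | 0.9664 | 0.9933 | 0.9949 | 944 (from web) | NR | 0.8491 | 0.9690 | NR |
| Wang et al. [25] | Min vs. S | 2004 | 104 | NR | 0.8846 | 0.9231 | 0.9508 | 106 (from web) | NR | 0.933 | 0.736 | NR |
| Hu et al. [26] | N vs. ROP | 2068 | 300 | 0.97 | 0.96 | 0.98 | 0.9922 | 406 (from the same set with TD) | NR | 0.900 | 0.989 | NR |
| Hu et al. [26] | Mil vs. S | 466 | 100 | 0.84 | 0.82 | 0.86 | 0.9212 | 31 (from ROP in TED) | NR | 0.944 | 0.923 | NR |
| Tan et al. [27] | N vs. plus | 5579 | 1395 | 0.973 | 0.966 | 0.98 | 0.993 | 90 (external set) | 0.856 | 0.939 | 0.807 | NR |
| Wang et al. [28] | N vs. ROP | 8507 | 1228 | 0.927 | 0.8999 | NR | NR | 1265 (from TD) | NR | NR | NR | NR |
| Wang et al. [28] | Mil vs. S | 1175 | 269 | 0.785 | 0.9235 | NR | NR | 289 (from ROP in TED) | NR | NR | NR | NR |
| Zhang et al. [29] | N vs. ROP | 17801 | 1742 | 0.988 | 0.935 | 0.995 | 0.998 | 1742 (from the same set with TD) | 0.988 | 0.935 | 0.995 | 0.998 |
| Huang et al. [30] | N vs. ROP | 2351 | 368 cases | NR | Average 0.911 | Average 0.992 | NR | 101 (from the same set with TD) | 0.96 | 0.966 | 0.952 | 0.97 |
| Huang et al. [30] | Mil vs. S | 2363 | 339 cases | NR | Average 0.987 | Average 0.985 | NR | 85 (from ROP in TED) | 0.988 | 1 | 0.984 | 0.99 |
| Ramachandran et al. [31] | N vs. plus | About 80% | About 20% | 0.99 | 0.99 | 0.98 | 0.9947 | 1610 (from the same set with TD) | NR | 0.98 | 0.98 | NR |
| Wang et al. [32] | Non vs. any stage | 36235 | 4813 | NR | 0.972 | 0.984 | 0.9977 | 7492 (from the same set with TD) | NR | 0.982 | 0.985 | 0.9981 |
| Wang et al. [32] | Non vs. preplus and plus | 13524 | 1866 | NR | 0.909 | 0.984 | 0.9882 | 2718 (from the same set with TD) | NR | 0.918 | 0.97 | 0.9827 |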

ROP, retinopathy of prematurity. Reference standards based on images: RSD, reference standard diagnosis; ICROP, International Classification of ROP. Reference standards based on both images and clinical information: CRYO-ROP, Cryotherapy for Retinopathy of Prematurity; ETROP, Early Treatment for ROP. N, normal; pre, preplus disease; plus, plus disease; Min, minor; Mil, mild; S, severe; i-ROP, Imaging and Informatics in Retinopathy of Prematurity; ART-ROP, Auckland Regional Telemedicine ROP image library; KIDROP, Karnataka Internet Assisted Diagnosis of ROP program; DL, deep learning; CNN, convolutional neural network; DNN, deep neural network; DCNN, deep convolutional neural network; TD, training dataset; VD, validation dataset; TED, test dataset (the total dataset comprises TD, VD, and TED); ACC, accuracy; SN, sensitivity; SP, specificity; AUC, area under the receiver operating characteristic curve; NR, not reported.
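For readers who want to relate ACC, SN, and SP to raw classification counts, the following is a minimal sketch using the standard confusion-matrix definitions. The class name `ConfusionCounts`, the function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from any of the included studies, which may have computed their metrics differently (for example, per image versus per case).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier (hypothetical container, for illustration)."""
    tp: int  # positives (e.g., ROP) correctly flagged
    fn: int  # positives missed
    tn: int  # negatives (e.g., normal) correctly cleared
    fp: int  # negatives wrongly flagged


def sensitivity(c: ConfusionCounts) -> float:
    """SN = TP / (TP + FN): share of true positives that are detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """SP = TN / (TN + FP): share of true negatives that are cleared."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """ACC = (TP + TN) / all samples."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Illustrative counts only (assumed, not from the table above).
example = ConfusionCounts(tp=45, fn=5, tn=90, fp=10)
print(f"SN={sensitivity(example):.3f} "
      f"SP={specificity(example):.3f} "
      f"ACC={accuracy(example):.3f}")
```

AUC is not derivable from a single set of counts; it requires the classifier's scores across all decision thresholds, which is one reason it is reported separately in the table.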