Research Article

A Cross-Modal Image and Text Retrieval Method Based on Efficient Feature Extraction and Interactive Learning CAE

Table 2

MAP (R = 50) values of different methods on three datasets.

Dataset           Method                MAP values
                                        Image query   Text query   Average

Flickr30K         Reference [15]        0.215         0.237        0.226
                  Reference [28]        0.304         0.312        0.328
                  Reference [22]        0.281         0.335        0.308
                  The proposed method   0.338         0.379        0.359

MSCOCO            Reference [15]        0.198         0.264        0.231
                  Reference [28]        0.293         0.319        0.301
                  Reference [22]        0.275         0.318        0.297
                  The proposed method   0.324         0.343        0.334

Pascal VOC 2007   Reference [15]        0.192         0.198        0.195
                  Reference [28]        0.279         0.295        0.262
                  Reference [22]        0.251         0.247        0.249
                  The proposed method   0.306         0.311        0.309
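
For readers unfamiliar with the metric reported in Table 2, the following is a minimal sketch of how MAP over the top R = 50 retrieved items is commonly computed. It is not taken from the paper: the function names are hypothetical, and the normalization (dividing by the number of relevant items found within the top R) is an assumption, since other variants of the metric exist.

```python
from typing import Sequence


def average_precision_at_r(relevance: Sequence[int], r: int = 50) -> float:
    """Average precision for one query over its top-r ranked results.

    relevance[k] is 1 if the item at rank k+1 is relevant to the query,
    else 0. Assumption: AP is normalized by the number of relevant items
    found within the top r (one common convention, not the only one).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:r], start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only where a relevant item occurs.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    all_relevance: Sequence[Sequence[int]], r: int = 50
) -> float:
    """Mean of the per-query average precision values."""
    if not all_relevance:
        return 0.0
    total = sum(average_precision_at_r(rel, r) for rel in all_relevance)
    return total / len(all_relevance)


if __name__ == "__main__":
    # Two toy queries with hand-made relevance lists (illustrative data only).
    toy_queries = [[1, 0, 1, 0], [0, 1]]
    print(mean_average_precision(toy_queries, r=50))
```

Under this convention, the image-query score would be obtained by ranking texts for each image query, and the text-query score by ranking images for each text query.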