Research Article
A Cross-Modal Image and Text Retrieval Method Based on Efficient Feature Extraction and Interactive Learning CAE
Table 2
MAP (R = 50) values of different methods on three datasets.
| Datasets | Methods | Image query | Text query | Average |
| --- | --- | --- | --- | --- |
| Flickr30K | Reference [15] | 0.215 | 0.237 | 0.226 |
| Flickr30K | Reference [28] | 0.304 | 0.312 | 0.328 |
| Flickr30K | Reference [22] | 0.281 | 0.335 | 0.308 |
| Flickr30K | The proposed method | 0.338 | 0.379 | 0.359 |
| MSCOCO | Reference [15] | 0.198 | 0.264 | 0.231 |
| MSCOCO | Reference [28] | 0.293 | 0.319 | 0.301 |
| MSCOCO | Reference [22] | 0.275 | 0.318 | 0.297 |
| MSCOCO | The proposed method | 0.324 | 0.343 | 0.334 |
| Pascal VOC 2007 | Reference [15] | 0.192 | 0.198 | 0.195 |
| Pascal VOC 2007 | Reference [28] | 0.279 | 0.295 | 0.262 |
| Pascal VOC 2007 | Reference [22] | 0.251 | 0.247 | 0.249 |
| Pascal VOC 2007 | The proposed method | 0.306 | 0.311 | 0.309 |
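
For readers who want to reproduce a MAP (R = 50) figure of the kind reported in Table 2, the following is a minimal sketch of one common formulation: for each query, precision is accumulated at every rank in the top R results where a relevant item appears, divided by the number of relevant items found in the top R, and the result is averaged over all queries. The exact definition used in this article is not restated alongside the table, so the normalization choice here is an assumption, and the function names `average_precision_at_r` and `mean_average_precision` are hypothetical.

```python
from typing import Sequence


def average_precision_at_r(relevance: Sequence[int], r: int = 50) -> float:
    """Average precision over the top-r entries of one ranked result list.

    `relevance` holds 1 (relevant) or 0 (not relevant) per retrieved item,
    ordered from best-ranked to worst-ranked.
    Assumption: AP is normalized by the number of relevant items in the top r.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:r], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    relevance_lists: Sequence[Sequence[int]], r: int = 50
) -> float:
    """Mean of the per-query average precision values."""
    if not relevance_lists:
        return 0.0
    total = sum(average_precision_at_r(rl, r) for rl in relevance_lists)
    return total / len(relevance_lists)


if __name__ == "__main__":
    # Two toy queries: image-to-text retrieval would supply one list per image.
    queries = [[1, 0, 1, 0], [0, 0, 0]]
    print(mean_average_precision(queries, r=50))
```

Under this sketch, the "Image query" column would correspond to calling `mean_average_precision` on relevance lists produced by image queries against the text collection, and the "Text query" column to the reverse direction.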