Research Article
CCAH: A CLIP-Based Cycle Alignment Hashing Method for Unsupervised Vision-Text Retrieval
Figure 3
Attention visualization using the CLIP image encoder. The left side shows the original image, and the right side shows the attention visualization results for features at different levels.