Research Article

CCAH: A CLIP-Based Cycle Alignment Hashing Method for Unsupervised Vision-Text Retrieval

Figure 3

The images are encoded with the CLIP image encoder. The left side shows the original image, and the right side shows the attention visualizations for features at different levels.