Research Article

PTF-SimCM: A Simple Contrastive Model with Polysemous Text Fusion for Visual Similarity Metric

Table 2

Hyperparameters set in experiments.

| Parameter description | Performance comparison | Ablation study |
| --- | --- | --- |
| Initial learning rate | 0.05 | 0.05 |
| Weight decay | 1e-4 | 1e-4 |
| Momentum | 0.9 | 0.9 |
| Dimension of image view features | 2048 | 2048 |
| Dimension of cross-modal embedding | 1024 | {512, 1024, 2048} |
| Dimension of metric embedding | {64, 128} | {64, 128, 256, 512} |
| Number of cross-modal embeddings | 2 | {1, 2, 3, 4} |
| Layers of multimodal projector | 4 | 4 |
| Layers of predictor | 2 | 2 |
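
For readers who want to reproduce the settings in Table 2, the values can be collected into a small configuration object. The following is a minimal sketch and not the authors' code: the class name `HParams`, the field names, and the helper `ablation_configs` are hypothetical. The table does not state whether the ablation study sweeps the full grid or varies one factor at a time, so the full Cartesian product is enumerated here purely for illustration. The default metric embedding dimension of 128 is likewise an arbitrary pick from the listed set {64, 128}.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class HParams:
    """Hypothetical container for the Table 2 settings (names are assumptions)."""
    # Values identical in both columns of Table 2.
    initial_lr: float = 0.05
    weight_decay: float = 1e-4
    momentum: float = 0.9
    image_feature_dim: int = 2048
    projector_layers: int = 4
    predictor_layers: int = 2
    # Values that differ between columns; defaults follow "Performance comparison".
    cross_modal_dim: int = 1024
    metric_dim: int = 128  # Table 2 lists {64, 128}; 128 is chosen arbitrarily here
    num_cross_modal_embeddings: int = 2


# Candidate values copied from the "Ablation study" column.
ABLATION_GRID = {
    "cross_modal_dim": [512, 1024, 2048],
    "metric_dim": [64, 128, 256, 512],
    "num_cross_modal_embeddings": [1, 2, 3, 4],
}


def ablation_configs(base=HParams()):
    """Yield one HParams per grid combination (full product is an assumption)."""
    names = list(ABLATION_GRID)
    for values in product(*(ABLATION_GRID[name] for name in names)):
        yield replace(base, **dict(zip(names, values)))


configs = list(ablation_configs())
print(len(configs))  # 3 * 4 * 4 = 48 combinations
```

The learning rate, weight decay, and momentum are held fixed across all generated configurations, matching the identical entries in both columns of the table.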