Research Article

Medical Image Description Based on Multimodal Auxiliary Signals and Transformer

Table 1

Performance of different methods on the IU X-ray and COV-CTR datasets.

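| Dataset | Method | BLEU1 | BLEU2 | BLEU3 | BLEU4 | METEOR | ROUGE_L | CIDEr |
|---|---|---|---|---|---|---|---|---|
| IU X-ray | Transformer | 0.422 | 0.264 | 0.177 | 0.120 | 0.164 | 0.338 | 0.421 |
| | CoAtt | 0.455 | 0.288 | 0.205 | 0.154 | - | 0.369 | 0.277 |
| | HRGR-Agent | 0.438 | 0.298 | 0.208 | 0.151 | - | 0.322 | 0.343 |
| | PPKED | 0.483 | 0.315 | 0.224 | 0.168 | 0.190 | 0.376 | 0.351 |
| | KERP | 0.482 | 0.325 | 0.226 | 0.162 | 0.187 | 0.339 | 0.280 |
| | M2Transformer | 0.463 | 0.318 | 0.214 | 0.155 | 0.192 | 0.335 | - |
| | ASGMD | 0.489 | 0.326 | 0.232 | 0.173 | 0.206 | 0.397 | - |
| | R2Gen (base) | 0.470 | 0.304 | 0.219 | 0.165 | 0.187 | 0.371 | 0.398 |
| | MDAK (ours) | 0.480 | 0.328 | 0.231 | 0.172 | 0.201 | 0.369 | 0.424 |
| | MDAKF (ours) | 0.494 | 0.318 | 0.229 | 0.174 | 0.194 | 0.389 | 0.371 |
| COV-CTR | CoAtt | 0.709 | 0.645 | 0.603 | 0.552 | - | 0.748 | - |
| | SAT | 0.697 | 0.621 | 0.568 | 0.515 | - | 0.723 | - |
| | ASGK | 0.712 | 0.659 | 0.611 | 0.570 | - | 0.746 | - |
| | AdaAtt | 0.676 | 0.633 | 0.596 | 0.514 | - | 0.726 | - |
| | R2Gen | 0.725 | 0.641 | 0.580 | 0.528 | 0.399 | 0.677 | 1.358 |
| | MDAK (ours) | 0.723 | 0.652 | 0.586 | 0.545 | 0.403 | 0.676 | 1.452 |
| | MDAKF (ours) | 0.726 | 0.651 | 0.583 | 0.539 | 0.401 | 0.683 | 1.354 |

A dash (-) marks a metric for which no value is listed.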

Bold values indicate the best performance for each metric on a given dataset.