Research Article
HRNet Encoder and Dual-Branch Decoder Framework-Based Scene Text Recognition Model
Table 2
Comparison of accuracy of ablation models (%).
| Model | IIIT5k | SVT | IC03 | IC13 | IC15 | SVTP | CUTE80 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Baseline (HRNet) | 91.7 | 88.4 | 93.4 | 92.2 | 78.6 | 80.2 | 80.9 |
| Baseline + SR (Bilinear Interpolation) | 93.0 | 89.5 | 92.7 | 92.7 | 81.1 | 81.1 | 78.1 |
| Baseline + SR (Bilinear Interpolation) + SAM | 93.0 | 92.1 | 91.9 | 93.2 | 81.7 | 83.3 | 81.2 |
| Baseline + SR (Trans Conv2D) + SAM | 93.4 | 91.8 | 93.3 | 93.6 | 81.8 | 82.6 | 81.6 |
| Proposed model | 93.7 | 91.3 | 93.3 | 94.3 | 82.8 | 83.1 | 83.0 |