Research Article
PF-ViT: Parallel and Fast Vision Transformer for Offline Handwritten Chinese Character Recognition
Table 4
Performance of different models on the DHWDB dataset: number of parameters, FLOPs, and accuracy.
| Methods | Number of encoder layers per channel | Epochs | #Params (M) | FLOPs (G) | Acc. (%) |
| --- | --- | --- | --- | --- | --- |
| T-ViT | 3 | 300 | 43.11 | 4.32 | 98.1 |
| T-ViT | 4 | 300 | 57.28 | 5.72 | 98.3 |
| T-ViT | 6 | 300 | 85.62 | 8.52 | 98.6 |
| F-ViT | 2 | 300 | 57.28 | 2.94 | 96.6 |
| F-ViT | 3 | 300 | 85.62 | 4.36 | 97.3 |
| F-ViT | 6 | 300 | 170.63 | 8.61 | 97.7 |
| S-ViT | 2 | 300 | 99.79 | 2.99 | 96.3 |
| S-ViT | 3 | 300 | 148.38 | 4.43 | 97.1 |
| S-ViT | 4 | 300 | 198.98 | 5.86 | 97.0 |