Research Article

Multitask Learning with Local Attention for Tibetan Speech Recognition

Table 4: Speaker ID recognition accuracy (%) of two-task models.

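| Architecture | Model | Lhasa-Ü-Tsang | Changdu-Kham | Amdo Pastoral |
|---|---|---|---|---|
| SpeakerID model | | 67.75 | 93.13 | 95.31 |
| WaveNet-CTC with speaker ID | S-S1 | 68.32 | 92.85 | 97.48 |
| WaveNet-CTC with speaker ID | S-S2 | 71.15 | 95.23 | 96.12 |
| Attention (5)-WaveNet-CTC | S-S1 | 0 | 0 | 0 |
| Attention (5)-WaveNet-CTC | S-S2 | 60.64 | 77.38 | 85.85 |
| WaveNet-Attention (5)-CTC | S-S1 | 70.35 | 92.85 | 97.48 |
| WaveNet-Attention (5)-CTC | S-S2 | 69.40 | 100 | 96.70 |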