Research Article
End-to-End Speech Synthesis for Tibetan Multidialect
Table 4
The MOS comparison of speech synthesized by different models.
| Model | MOS of Lhasa-Ü-Tsang dialect | MOS of Amdo pastoral dialect |
| --- | --- | --- |
| Linear predictive amplitude spectrum + Griffin–Lim | 3.30 | 3.52 |
| Mel spectrogram + Griffin–Lim | 3.55 | 3.70 |
| Mel spectrogram + WaveNet | 3.95 | 4.18 |