Research Article
A Voice Cloning Method Based on the Improved HiFi-GAN Model
Table 7
Similarity MOS (SMOS), with confidence intervals (CI), of cloned speech for different models.
| Metric | Settings | LibriSpeech | VCTK | THchs-30 |
| --- | --- | --- | --- | --- |
| SMOS (CI) | Multispeaker TTS | 3.56 ± 0.07 | 3.18 ± 0.06 | 3.25 ± 0.08 |
| | Multispeaker TTS + x-vector | 3.91 ± 0.06 | 3.44 ± 0.07 | 3.59 ± 0.06 |
| | WaveGlow + d-vector | 3.55 ± 0.09 | 3. ± 0.09 | 3.32 ± 0.07 |
| | WaveGlow + x-vector | 3.89 ± 0.08 | 3.47 ± 0.09 | 3.64 ± 0.05 |
| | HiFi-GAN + d-vector | 3.82 ± 0.05 | 3.38 ± 0.07 | 3.43 ± 0.09 |
| | HiFi-GAN + x-vector | 4.15 ± 0.07 | 3.61 ± 0.08 | 3.68 ± 0.08 |
| | Improved HiFi-GAN + d-vector | 3.99 ± 0.10 | 3.52 ± 0.06 | 3.61 ± 0.05 |
| | Improved HiFi-GAN + x-vector | 4.23 ± 0.06 | 3.80 ± 0.08 | 3.84 ± 0.07 |