Research Article

Acoustic Model with Multiple Lexicon Types for Indonesian Speech Recognition

Table 1

The methods in the literature, dataset used, and their performance [26–30].

| Related work | Method | Dataset | Performance (%WER) |
|---|---|---|---|
| Enhancement of automatic speech recognition by deep neural networks [26] | DNN-HMM, data augmentation | The 34 hours speech of English diverse dataset | 16.85% |
| Self-supervised speech enhancement for Arabic speech recognition in real-world environment [27] | Denoising auto encoder, HMM | The Arabic mobile parallel speech multi-dialect speech corpus | 30.17% |
| Effect of pitch enhancement in Punjabi children’s speech recognition system under disparate acoustic conditions [28] | Pitch enhancement, DNN-HMM | The Punjabi adult/child speech dataset | 10.98%∼12.24% |
| A hybrid speech enhancement algorithm for voice assistance application [29] | Noise suppression, HMM | The 8.5 hours English medical speech dataset (RAVDESS) | 17.5%∼22.9% |
| Dual application of speech enhancement for automatic speech recognition [30] | RNN transducer, data augmentation | The social media English video dataset | 8.3%∼13.4% |