Research Article

Spoken Language Identification Using Deep Learning

Table 8: Performance measures obtained from various language identification techniques.

| Model | Dataset used | Train acc. | Test acc. | Validation acc. | Sensitivity | Specificity | F1 score | Precision | Recall |
|---|---|---|---|---|---|---|---|---|---|
| CNN | Spoken language identification | 100 | 98.63 | 98.90 | 98.90 | 99 | 99.90 | 99.90 | 99.90 |
| Naïve Bayes | Language identification dataset | 94 | 94 | 93 | 100 | 99 | 93.60 | 94.90 | 93.60 |
| Word embedding | Language identification dataset | 95.50 | 92.20 | 93.20 | 98 | 99 | 92.50 | 93.50 | 92.50 |
| Logistic regression | Spoken language identification | 64.32 | 61.95 | 63.55 | 64.23 | 63.55 | 64.23 | 64.23 | 64.23 |
| Naïve Bayes | Spoken language identification | 50.25 | 49.75 | 51.20 | 52.32 | 51.78 | 51.20 | 51.20 | 51.20 |
| SVM | Common voice Kaggle | 82.88 | 83.32 | 82.65 | 75.95 | 75.55 | 76.95 | 76.95 | 76.95 |
| Random forest classifier | Common voice Kaggle | 72.42 | 71.90 | 71.50 | 66.32 | 67.24 | 67.23 | 67.23 | 67.23 |
| VGG16 | Mozilla common voice | 81.30 | 80.21 | 81.05 | 81.25 | 80.52 | 79.83 | 80.02 | 80.02 |
| ResNet50 | Spoken language identification | 86.30 | 80.20 | 84.32 | 85.36 | 84.23 | 84.65 | 82.75 | 82.75 |
| CapsNet [20] | Spoken language identification | 91.80 | 88.76 | 90.72 | 88 | 89 | 89 | 89 | 89 |
| 2D ConvNet bidirectional GRU [14] | Spoken language identification | 68.85 | 65.23 | 67.82 | 68 | 66 | 66 | 66 | 66 |
| Acoustic model [25] | Spoken language identification | 75.69 | 73.23 | 74.23 | 75 | 75 | 75 | 75 | 75 |
| CNN LSTM [19] | Spoken language identification | 83.25 | 80.52 | 82.45 | 83 | 82 | 82 | 82 | 82 |
| Logistic regression I-vector [26] | Mozilla common voice | 84.30 | 80.23 | 82.52 | 78.95 | 80.05 | 82.36 | 82.36 | 82.36 |
| LSTM-CNN [29] | Common voice Kaggle | 70.21 | 68.33 | 69.96 | 67.23 | 68.33 | 69.54 | 69.54 | 69.54 |
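
For readers who want to relate the columns above to raw predictions, the following is a minimal sketch of how accuracy, precision, recall (sensitivity), specificity, and F1 score are commonly derived from a multi-class confusion matrix. It assumes one-vs-rest counting per language and macro averaging over languages. This section does not state which averaging scheme was used for the table, so the choice here is an assumption, and the function name `macro_metrics` and the example matrix are hypothetical.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Macro-averaged metrics, in percent, from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Macro averaging is an assumption here.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, specificities, f1s = [], [], [], []
    for k in range(n):
        # One-vs-rest counts for class k.
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp

        # Guard against empty denominators (class never seen or predicted).
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )

        precisions.append(precision)
        recalls.append(recall)
        specificities.append(specificity)
        f1s.append(f1)

    def mean_pct(values: List[float]) -> float:
        return 100.0 * sum(values) / len(values)

    return {
        "accuracy": 100.0 * correct / total,
        "precision": mean_pct(precisions),
        "recall": mean_pct(recalls),  # sensitivity under the standard definition
        "specificity": mean_pct(specificities),
        "f1": mean_pct(f1s),
    }


if __name__ == "__main__":
    # Hypothetical 3-language confusion matrix, not data from the article.
    example = [
        [8, 1, 1],
        [2, 7, 1],
        [0, 1, 9],
    ]
    for name, value in macro_metrics(example).items():
        print(f"{name}: {value:.2f}")
```

Under the standard definitions used in this sketch, sensitivity and recall are the same quantity, so a single value is returned for both. A weighted or micro average would give different numbers on imbalanced data.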