Research Article

Improvement in Automatic Speech Recognition of South Asian Accent Using Transfer Learning of DeepSpeech2

Algorithm 1

Dataset preprocessing algorithm.
Read the CSV file of the Common Voice dataset
Select the rows with “Pakistani,” “Indian,” “Dutch,” and “Sri-Lankan” accents from the country column
Select the “filename” column from the CSV file
Create a dictionary for writing the JSON file
for i in filename do
    flacConvert(i)
    SilenceRemover(i)
    Duration(i)
    Find the transcript corresponding to the audio file
    for lineIndex in range(x) do
        transcript = split[lineIndex]
        Write JSON file
        data = {"audio_filepath": "1.FLAC", "duration": duration, "text": transcript}
        with open("json-other-train.txt", "a") as outfile:
            json.dump(data, outfile)
    end for
end for
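The row selection and JSON-writing steps of the loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (select_rows, build_manifest_entry, append_manifest, read_manifest) are hypothetical, the accent column is assumed to be called "country" as in the listing, and a newline is written after each record on the assumption that the manifest is meant to hold one JSON object per line.

```python
import csv
import json


def select_rows(csv_path, accents, accent_column="country"):
    """Return the CSV rows whose accent column matches one of `accents`.

    The column name is an assumption taken from the listing above.
    """
    wanted = {a.lower() for a in accents}
    with open(csv_path, newline="", encoding="utf-8") as infile:
        return [
            row
            for row in csv.DictReader(infile)
            if (row.get(accent_column) or "").lower() in wanted
        ]


def build_manifest_entry(audio_filepath, duration, text):
    """Build one record with the three keys used in Algorithm 1."""
    return {"audio_filepath": audio_filepath, "duration": duration, "text": text}


def append_manifest(entries, manifest_path):
    """Append each record to the manifest as one JSON object per line."""
    with open(manifest_path, "a", encoding="utf-8") as outfile:
        for entry in entries:
            json.dump(entry, outfile, ensure_ascii=False)
            outfile.write("\n")  # assumption: one record per line


def read_manifest(manifest_path):
    """Read the manifest back into a list of dictionaries."""
    with open(manifest_path, encoding="utf-8") as infile:
        return [json.loads(line) for line in infile if line.strip()]
```

Writing the newline explicitly matters when the file is opened in append mode: without it, successive json.dump calls would produce concatenated objects on a single line, which a line-oriented reader cannot parse.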
flacConvert(i): convert mp3 to FLAC
Calculate audio file length
Calculate audio file sample rate
Calculate duration = audio file length / sample rate
Calculate loudness
Calculate peak amplitude
Split the audio file into chunks at silences longer than 0.5 second (a segment is treated as silent if it is quieter than −16 dBFS)
for chunk in enumerate(chunks) do
    List.Append(chunk)
end for
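The duration and silence-splitting steps can be illustrated with a small pure-Python sketch. It is not the authors' code and makes several assumptions: samples are floats normalized to [-1, 1] (so full scale is 1.0), loudness is measured as the RMS level of fixed 10 ms windows, and a run of silent windows of at least 0.5 s closes the current chunk. The function names and the window size are hypothetical choices made for illustration.

```python
import math


def duration_seconds(samples, sample_rate):
    """Duration = number of samples / sample rate."""
    return len(samples) / sample_rate


def peak_amplitude(samples):
    """Largest absolute sample value (0.0 for empty input)."""
    return max((abs(s) for s in samples), default=0.0)


def dbfs(samples):
    """RMS level in dB relative to full scale, for floats in [-1, 1]."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def split_at_silence(samples, sample_rate, min_silence_ms=500,
                     silence_thresh_db=-16.0, window_ms=10):
    """Split `samples` into non-silent chunks.

    A window is silent if its level is below `silence_thresh_db`; a run of
    silent windows lasting at least `min_silence_ms` ends the current chunk.
    """
    window = max(1, sample_rate * window_ms // 1000)   # samples per window
    min_silent_windows = -(-min_silence_ms // window_ms)  # ceiling division

    flags = [
        dbfs(samples[start:start + window]) < silence_thresh_db
        for start in range(0, len(samples), window)
    ]

    chunks = []
    chunk_start = None  # window index where the current chunk began
    silent_run = 0      # length of the current run of silent windows
    for idx, silent in enumerate(flags):
        if silent:
            silent_run += 1
            if chunk_start is not None and silent_run >= min_silent_windows:
                end = idx - silent_run + 1  # first window of the silent run
                chunks.append(samples[chunk_start * window:end * window])
                chunk_start = None
        else:
            if chunk_start is None:
                chunk_start = idx
            silent_run = 0
    if chunk_start is not None:
        chunks.append(samples[chunk_start * window:])
    return chunks
```

With these defaults, a pause shorter than 0.5 s stays inside its chunk, while a longer pause separates two chunks and is itself discarded. Note that a threshold of −16 dBFS is fairly high, so quiet speech may also be classified as silence under this measure.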