Computational Intelligence and Neuroscience

Research Article

Deep Learning-Based Classification of Spoken English Digits

The DFNN Classification Model.

(1)	procedure Deep Feedforward Neural Network Classifier (X, Y) ⊳ X contains the STFT features of each audio sample, while Y contains the target audio class label
(2)	Read the dataset using the Librosa library
(3)	Extract STFT features from the audio
(4)	One-hot encode the audio class labels to produce the target vectors.
(5)	Split the dataset into training and testing sets, with the STFT features as the input and the audio class as the target label.
(6)	Start the DFNN model
(7)	Epoch = N; audio = first audio
(8)	for : do
(9)	First_Layer = Dense (audio, input_dim = 1025, output_dim = 256)
(10)	Second_Layer = Dense (input_dim = 256, output_dim = 128)
(11)	Third_Layer = Dense (input_dim = 128, output_dim = 128)
(12)	Fourth_Layer = Dense (input_dim = 128, output_dim = 128)
(13)	Output_Layer = Dense (input_dim = 128, output_dim = 10)
(14)	if Output_Layer == target_Layer then
(15)	audio = next audio
(16)	end if
(17)	end for
(18)	end procedure
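
The layer stack in the listing can be sketched as a plain NumPy forward pass. This is a minimal illustration, not the authors' implementation: the listing does not state the activation functions or the weight initialization, so the ReLU hidden layers, softmax output, and He-style initialization below are assumptions, as are the helper names (`one_hot`, `init_params`, `forward`) and the random placeholder data standing in for real STFT features. The input width of 1025 matches the number of frequency bins an STFT produces with a 2048-point FFT (2048/2 + 1), which is Librosa's default.

```python
import numpy as np

# Layer widths taken from the algorithm listing: 1025 -> 256 -> 128 -> 128 -> 128 -> 10.
LAYER_SIZES = [1025, 256, 128, 128, 128, 10]


def one_hot(labels, num_classes=10):
    """Return a (n_samples, num_classes) one-hot matrix for integer labels."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


def init_params(layer_sizes, rng):
    """Random weights and zero biases per dense layer (He-style init is an assumption)."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(params, x):
    """Forward pass: ReLU on hidden layers, softmax on the output (both assumed)."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    logits = h @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)

# Placeholder data: 8 random vectors standing in for 1025-dimensional STFT features,
# with random digit labels 0-9. Real features would come from the audio files.
features = rng.random((8, 1025))
labels = rng.integers(0, 10, size=8)
targets = one_hot(labels)

params = init_params(LAYER_SIZES, rng)
probs = forward(params, features)      # class probabilities, one row per sample
predicted = probs.argmax(axis=1)       # most likely digit per sample
accuracy = float((predicted == labels).mean())
```

Here `probs` has one row per sample and one column per digit class. With untrained weights the predictions are arbitrary, so a training loop with a loss function and optimizer of choice would be needed before `accuracy` carries any meaning.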