Research Article

Study on the Application of Improved Audio Recognition Technology Based on Deep Learning in Vocal Music Teaching

Table 3

Humming recognition neural network model.

Network layerOutput dimension (none indicates the number of audio)Number of parameters

Input_1(input layer)(None,1,160000)0
Melspectrogram_1(Mel spectrogram)(None,64,625,1)0
Batch_normalization_1(batch normalization)(None,64,625,1)4
Conv2d_1(convolutional layer)(None,64,625,21)210
Batch_normalization_2(batch normalization)(None,64,625,21)84
Activation_1(ReLU)(None,64,625,21)0
Max_pooling2d_1(None,32,313,21)0
Conv2d_2(convolutional layer)(None,32,313,21)3990
Batch_normalization_3(batch normalization)(None,32,313,21)84
Activation_2(ReLU)(None,32,313,21)0
Max_pooling2d_2(None,16,157,21)0
Conv2d_3(convolutional layer(None,16,157,21)3990
Batch_mormalization_4(None,16,157,21)84
Activation_3(None,16,157,21)0
Max_pooling2d_3(None,8,79,21)0
Conv2d_4(convolutional layer)(None,8,79,21)3990
Batch_normalization_5(None,8,79,21)84
Activation_4(ReLU)(None,8,79,21)0
Max_pooling2d_4(None,4,40,21)0
Reshape(reduce dimension)(None,160,21)0
Gur1(GRU)(None,160,41)7749
Gru2(GRU)(None,41)10209
Dropout(None,41)0
Dense_1(Softmax)(None,50)2100