International Journal of Digital Multimedia Broadcasting, 2020, Research Article
Identification of Weakly Pitch-Shifted Voice Based on Convolutional Neural Network

Table 2: Detection performance of strongly pitch-shifted voice in binary classification.
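| Pitch shifting software | Training dataset | Testing dataset | LFCC + GMM [6] Rate | LFCC + GMM [6] FAR | MFCC + GMM [8] Rate | MFCC + GMM [8] FAR | Proposed Rate | Proposed FAR |
|---|---|---|---|---|---|---|---|---|
| Audition | TIMIT | TIMIT | 99.86 | 0.02 | 99.88 | 0.02 | 99.54 | 0.10 |
| Audition | TIMIT | UME | 97.60 | 1.10 | 98.06 | 1.19 | 95.89 | 1.52 |
| Audition | UME | TIMIT | 99.52 | 0.36 | 98.58 | 0.02 | 97.51 | 1.45 |
| Audition | UME | UME | 99.79 | 0.15 | 99.79 | 0.12 | 99.49 | 0.12 |
| GoldWave | TIMIT | TIMIT | 99.97 | 0.00 | 99.94 | 0.01 | 99.58 | 0.05 |
| GoldWave | TIMIT | UME | 97.93 | 0.75 | 96.82 | 2.04 | 96.29 | 1.53 |
| GoldWave | UME | TIMIT | 99.72 | 0.05 | 98.45 | 0.01 | 98.44 | 1.17 |
| GoldWave | UME | UME | 99.87 | 0.02 | 99.70 | 0.07 | 99.12 | 0.36 |
| Audacity | TIMIT | TIMIT | 99.98 | 0.00 | 99.97 | 0.00 | 99.97 | 0.00 |
| Audacity | TIMIT | UME | 99.13 | 0.44 | 97.57 | 2.10 | 99.78 | 0.07 |
| Audacity | UME | TIMIT | 99.97 | 0.01 | 98.72 | 0.00 | 99.96 | 0.01 |
| Audacity | UME | UME | 99.97 | 0.00 | 99.95 | 0.00 | 99.84 | 0.11 |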
Bold values represent the best performance among the three methods under the same circumstances (in the same row). For the detection rate (Rate) criterion, higher is better. For the false alarm rate (FAR) criterion, lower is better.
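
The selection rule in the note above (highest Rate, lowest FAR within a row) can be sketched in a few lines of Python. This is only an illustration of that rule, not code from the article: the function name `best_methods` and the list layout are assumptions, and the sample values are one row copied from Table 2.

```python
# Methods in the column order of Table 2.
METHODS = ["LFCC + GMM [6]", "MFCC + GMM [8]", "Proposed"]


def best_methods(values, higher_is_better):
    """Return the methods holding the best value in one table row.

    Ties are kept, so more than one method can be returned.
    `best_methods` is a hypothetical helper, not part of the article.
    """
    target = max(values) if higher_is_better else min(values)
    return [m for m, v in zip(METHODS, values) if v == target]


# One row copied from Table 2: Audacity, trained on TIMIT, tested on UME.
rates = [99.13, 97.57, 99.78]  # detection rate: higher is better
fars = [0.44, 2.10, 0.07]      # false alarm rate: lower is better

best_rate = best_methods(rates, higher_is_better=True)
best_far = best_methods(fars, higher_is_better=False)
print("Best Rate:", best_rate)
print("Best FAR:", best_far)
```

Keeping ties in the result is a design choice made for this sketch; several rows of the table contain equal values, and the note does not say how such ties are marked.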