Research Article
An Adaptive Method Based on Multiscale Dilated Convolutional Network for Binaural Speech Source Localization
Table 3
Localization accuracy (%) of different approaches in the noisy and reverberant scenes.
| RT60/DRR | | 0.1 s/−1.44 dB | 0.1 s/−1.44 dB | 0.3 s/−2.02 dB | 0.3 s/−2.02 dB | 0.5 s/−2.58 dB | 0.5 s/−2.58 dB |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Noise/SNR | Avg. | -/- | White/15 dB | -/- | White/15 dB | -/- | White/15 dB |
| MLP [8] | 28.87 | 43.24 | 24.46 | 33.42 | 24.19 | 23.84 | 24.05 |
| DNN [19] | 67.69 | 92.14 | 78.11 | 74.94 | 53.51 | 63.81 | 43.65 |
| Regular CNN | 61.40 | 85.26 | 79.73 | 58.23 | 52.16 | 49.40 | 43.65 |
| Dilation-2 CNN | 57.69 | 77.15 | 75.41 | 56.02 | 50.14 | 43.74 | 43.65 |
| Dilation-5 CNN | 84.03 | 94.59 | 89.46 | 92.14 | 75.95 | 86.62 | 65.41 |
| Cascaded DCNN | 73.16 | 91.15 | 77.84 | 84.52 | 56.62 | 79.25 | 49.59 |
| Ours | 78.86 | 93.12 | 87.97 | 83.78 | 71.08 | 76.50 | 60.68 |
| Ours | 83.48 | 94.59 | 89.05 | 90.66 | 77.70 | 85.08 | 63.81 |
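The table does not say how the Avg. column is computed. A minimal sketch, assuming it is the unweighted mean of the six RT60/noise conditions in each row, checks that reading against the reported figures. The names `ROWS`, `avg_deviation`, and `check_rows` are illustrative and not taken from the paper; the numbers are transcribed from Table 3.

```python
# Rows transcribed from Table 3: (method, reported Avg., six per-condition accuracies).
# Condition order: 0.1 s clean, 0.1 s white, 0.3 s clean, 0.3 s white, 0.5 s clean, 0.5 s white.
ROWS = [
    ("MLP [8]",        28.87, [43.24, 24.46, 33.42, 24.19, 23.84, 24.05]),
    ("DNN [19]",       67.69, [92.14, 78.11, 74.94, 53.51, 63.81, 43.65]),
    ("Regular CNN",    61.40, [85.26, 79.73, 58.23, 52.16, 49.40, 43.65]),
    ("Dilation-2 CNN", 57.69, [77.15, 75.41, 56.02, 50.14, 43.74, 43.65]),
    ("Dilation-5 CNN", 84.03, [94.59, 89.46, 92.14, 75.95, 86.62, 65.41]),
    ("Cascaded DCNN",  73.16, [91.15, 77.84, 84.52, 56.62, 79.25, 49.59]),
    ("Ours",           78.86, [93.12, 87.97, 83.78, 71.08, 76.50, 60.68]),
    ("Ours",           83.48, [94.59, 89.05, 90.66, 77.70, 85.08, 63.81]),
]


def avg_deviation(reported, values):
    """Absolute gap between the reported Avg. and the unweighted mean of the conditions."""
    return abs(reported - sum(values) / len(values))


def check_rows(rows, tol=0.01):
    """Return the methods whose reported Avg. differs from the mean by more than tol.

    Assumption: Avg. is the plain mean of the six conditions; tol allows for
    two-decimal rounding in the published figures.
    """
    return [name for name, reported, values in rows
            if avg_deviation(reported, values) > tol]


if __name__ == "__main__":
    for name, reported, values in ROWS:
        mean = sum(values) / len(values)
        print(f"{name:15s} reported={reported:.2f}  mean={mean:.3f}")
    print("rows inconsistent with a plain mean:", check_rows(ROWS))
```

Under this assumption every row agrees with its reported average to within rounding, which also supports the column order used when reading the table.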