Research Article
An Adaptive Method Based on Multiscale Dilated Convolutional Network for Binaural Speech Source Localization
Table 1
Configuration of training and testing sets.
| | Training set | Testing set |
| --- | --- | --- |
| KEMAR HRIRs | Anechoic HRIRs | Anechoic HRIRs with headphone AKG K271 MK II |
| TIMIT speech recordings | 10 males and 10 females | Other 3 males and 3 females |
| Source-to-sensor distance | 0.5 m, 1 m, 2 m, 3 m | 1 m, 1.5 m |
| Noise types | Babble, destroyerops and factory1 | White, m109 and f16 |
| SNRs | −20 : 15 : 25 dB | −10 : 10 : 30 dB |
| Reverberation time | None | 0.1 s, 0.3 s, 0.5 s |
| Direct-to-reverberant ratio (DRR) | None | −1.44 dB, −2.02 dB, −2.58 dB |
| Number of binaural mixtures | 52369 noise-free and noisy signals and 5819 for validation set | 936 for each kind of noise and SNR, and 1221 reverberant signals |
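The SNR entries in Table 1 appear to use a start : step : end notation. As a minimal sketch under that assumption, the following Python snippet expands both ranges and enumerates the noisy test conditions. The helper `snr_grid` and all variable names are hypothetical and are not taken from the article. The total count assumes that "936 for each kind of noise and SNR" means 936 mixtures per (noise type, SNR) pair.

```python
from itertools import product


def snr_grid(start_db, step_db, stop_db):
    """Expand a start:step:stop range (stop inclusive) into a list of SNRs in dB."""
    return list(range(start_db, stop_db + 1, step_db))


# Assumed reading of Table 1: "-20 : 15 : 25 dB" and "-10 : 10 : 30 dB".
TRAIN_SNRS_DB = snr_grid(-20, 15, 25)
TEST_SNRS_DB = snr_grid(-10, 10, 30)

# Test-set noise types as listed in Table 1.
TEST_NOISES = ["white", "m109", "f16"]

# Every (noise type, SNR) pair in the noisy test set.
test_conditions = list(product(TEST_NOISES, TEST_SNRS_DB))

# Assumption: 936 mixtures per (noise type, SNR) pair.
MIXTURES_PER_CONDITION = 936
total_noisy_test = MIXTURES_PER_CONDITION * len(test_conditions)

print(TRAIN_SNRS_DB)         # [-20, -5, 10, 25]
print(TEST_SNRS_DB)          # [-10, 0, 10, 20, 30]
print(len(test_conditions))  # 15
print(total_noisy_test)      # 14040
```

Under this reading, the training and testing SNR grids share only the 10 dB point, and the testing noise types do not overlap with the training ones.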