Research Article

An Adaptive Method Based on Multiscale Dilated Convolutional Network for Binaural Speech Source Localization

Table 1

Configuration of training and testing sets.

| | Training set | Testing set |
|---|---|---|
| KEMAR HRIRs | Anechoic HRIRs | Anechoic HRIRs with headphone AKG K271 MK II |
| TIMIT speech recordings | 10 males and 10 females | Other 3 males and 3 females |
| Source-to-sensor distance | 0.5 m, 1 m, 2 m, 3 m | 1 m, 1.5 m |
| Noise types | Babble, destroyerops, and factory1 | White, m109, and f16 |
| SNRs | −20 dB : 15 : 25 dB | −10 dB : 10 : 30 dB |
| Reverberation time | None | 0.1 s, 0.3 s, 0.5 s |
| Direct-to-reverberant ratio (DRR) | None | −1.44 dB, −2.02 dB, −2.58 dB |
| Number of binaural mixtures | 52369 noise-free and noisy signals, plus 5819 for the validation set | 936 for each kind of noise and SNR, plus 1221 reverberant signals |
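The SNR entries in Table 1 can be read as ranges. The sketch below assumes the notation means start : step : end (inclusive), which is an interpretation and is not stated in the table itself. It expands both ranges and enumerates the noisy testing conditions. The helper `snr_grid` and all variable names are hypothetical, and the total assumes that the figure of 936 mixtures applies to every pairing of noise type and SNR.

```python
from itertools import product


def snr_grid(start_db, step_db, stop_db):
    """Expand an inclusive start:step:stop range into a list of SNRs in dB."""
    return list(range(start_db, stop_db + 1, step_db))


# Assumed reading of the "SNRs" row of Table 1 (start : step : end).
train_snrs = snr_grid(-20, 15, 25)
test_snrs = snr_grid(-10, 10, 30)

# Noise types of the testing set, as listed in Table 1.
test_noises = ["white", "m109", "f16"]

# Every (noise type, SNR) pair in the noisy part of the testing set.
test_conditions = list(product(test_noises, test_snrs))

# Table 1 gives 936 mixtures "for each kind of noise and SNR"; the total
# below holds only if that count applies to every pair (an assumption).
mixtures_per_condition = 936
total_noisy_test = len(test_conditions) * mixtures_per_condition
```

Under this reading the training and testing sets share only the 10 dB level, so most noisy testing conditions fall at SNRs that do not appear in training.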