Research Article

ASLNet: An Encoder-Decoder Architecture for Audio Splicing Detection and Localization

Table 1

Illustration of audio clips in each dataset.

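| Dataset | Language | Duration (seconds) | Original clips | Spliced clips | Total clips |
|---|---|---|---|---|---|
| ENSet2s | English | 2 | 9,898 | 15,173 | 25,071 |
| ENSet3s | English | 3 | 4,089 | 19,783 | 23,872 |
| CNSet2s | Chinese | 2 | 44,727 | 86,073 | 130,800 |
| CNSet3s | Chinese | 3 | 44,669 | 85,865 | 130,534 |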