Research Article

Acoustic Model with Multiple Lexicon Types for Indonesian Speech Recognition

Table 3

The Indonesian audio dataset before validation.

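| Category | Original audio: number of audio files | Original audio: total duration (hours) | Utterances: number of utterances | Utterances: total duration (hours) |
|---|---|---|---|---|
| Autos and vehicles | 20 | 2.8039 | 2,234 | 1.3289 |
| Comedy | 40 | 7.7125 | 6,017 | 4.0768 |
| Education | 553 | 54.9508 | 47,477 | 38.4886 |
| Entertainment | 308 | 57.1747 | 39,155 | 25.2972 |
| Film and animation | 77 | 7.7464 | 6,475 | 4.5591 |
| Gaming | 1 | 0.1150 | 124 | 0.0987 |
| Howto and style | 355 | 54.5117 | 28,726 | 41.8189 |
| Music | 56 | 4.6469 | 1,409 | 2.5783 |
| News and politics | 170 | 24.6003 | 19,295 | 16.8386 |
| People and blogs | 179 | 50.2578 | 32,205 | 22.7193 |
| Pets and animals | 2 | 0.3153 | 57 | 0.0433 |
| Science and technology | 215 | 44.5308 | 31,492 | 22.9787 |
| Sports | 1 | 0.1583 | 137 | 0.1131 |
| Travel and events | 1 | 0.2236 | 78 | 0.0633 |
| Uncategorized | 2 | 0.3367 | 410 | 0.2515 |
| Total | 1,980 | 310.0847 | 215,291 | 181.2543 |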
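
The per-category figures in Table 3 can be cross-checked against the reported totals with a few lines of Python. This is only a minimal sketch: the names `ROWS`, `REPORTED_TOTAL`, and `column_totals` are hypothetical helpers introduced here, the rows are transcribed by hand from the table, and the final ratio of utterance hours to original audio hours is an illustrative derived quantity, not a figure reported by the article.

```python
# Rows transcribed from Table 3 (an assumption of this sketch):
# (category, audio files, audio hours, utterances, utterance hours)
ROWS = [
    ("Autos and vehicles", 20, 2.8039, 2234, 1.3289),
    ("Comedy", 40, 7.7125, 6017, 4.0768),
    ("Education", 553, 54.9508, 47477, 38.4886),
    ("Entertainment", 308, 57.1747, 39155, 25.2972),
    ("Film and animation", 77, 7.7464, 6475, 4.5591),
    ("Gaming", 1, 0.1150, 124, 0.0987),
    ("Howto and style", 355, 54.5117, 28726, 41.8189),
    ("Music", 56, 4.6469, 1409, 2.5783),
    ("News and politics", 170, 24.6003, 19295, 16.8386),
    ("People and blogs", 179, 50.2578, 32205, 22.7193),
    ("Pets and animals", 2, 0.3153, 57, 0.0433),
    ("Science and technology", 215, 44.5308, 31492, 22.9787),
    ("Sports", 1, 0.1583, 137, 0.1131),
    ("Travel and events", 1, 0.2236, 78, 0.0633),
    ("Uncategorized", 2, 0.3367, 410, 0.2515),
]

# "Total" row as printed in Table 3.
REPORTED_TOTAL = (1980, 310.0847, 215291, 181.2543)


def column_totals(rows):
    """Sum each numeric column; hour columns are rounded to 4 decimals as in the table."""
    files = sum(r[1] for r in rows)
    audio_hours = round(sum(r[2] for r in rows), 4)
    utterances = sum(r[3] for r in rows)
    utterance_hours = round(sum(r[4] for r in rows), 4)
    return files, audio_hours, utterances, utterance_hours


totals = column_totals(ROWS)

# Compare computed sums with the reported totals (small tolerance for float error).
matches = all(abs(a - b) < 1e-6 for a, b in zip(totals, REPORTED_TOTAL))
print("Totals match Table 3:", matches)

# Illustrative only: share of the original audio duration covered by utterances.
retained = totals[3] / totals[1]
print(f"Utterance hours / original audio hours: {retained:.1%}")
```

A check of this kind is also a quick way to catch transcription errors when the table is copied into a spreadsheet or script.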