Research Article
Acoustic Model with Multiple Lexicon Types for Indonesian Speech Recognition
Table 3
The Indonesian audio dataset before validation.
| Category | Original audio: number of audio | Original audio: total duration (hours) | Utterances: number of utterances | Utterances: total duration (hours) |
| --- | --- | --- | --- | --- |
| Autos and vehicles | 20 | 2.8039 | 2,234 | 1.3289 |
| Comedy | 40 | 7.7125 | 6,017 | 4.0768 |
| Education | 553 | 54.9508 | 47,477 | 38.4886 |
| Entertainment | 308 | 57.1747 | 39,155 | 25.2972 |
| Film and animation | 77 | 7.7464 | 6,475 | 4.5591 |
| Gaming | 1 | 0.1150 | 124 | 0.0987 |
| Howto and style | 355 | 54.5117 | 28,726 | 41.8189 |
| Music | 56 | 4.6469 | 1,409 | 2.5783 |
| News and politics | 170 | 24.6003 | 19,295 | 16.8386 |
| People and blogs | 179 | 50.2578 | 32,205 | 22.7193 |
| Pets and animals | 2 | 0.3153 | 57 | 0.0433 |
| Science and technology | 215 | 44.5308 | 31,492 | 22.9787 |
| Sports | 1 | 0.1583 | 137 | 0.1131 |
| Travel and events | 1 | 0.2236 | 78 | 0.0633 |
| Uncategorized | 2 | 0.3367 | 410 | 0.2515 |
| Total | 1,980 | 310.0847 | 215,291 | 181.2543 |