Research Article
A Computational Linguistic Approach for Gender Prediction Based on Vietnamese Names
Table 2
Details of the GenderVN1.0 dataset.
| | Original dataset | After filtering duplicated names |
| --- | --- | --- |
| Male | 1,499,980 | 582,997 |
| Female | 1,530,988 | 564,654 |
| Total of names | 3,031,020 | 1,147,651 |
| Total of words | 10,272,108 | 4,015,734 |