Research Article

A Computational Linguistic Approach for Gender Prediction Based on Vietnamese Names

Table 2

Detailed statistics of the GenderVN1.0 dataset.

                  Original dataset    After filtering duplicate names
Male              1,499,980           582,997
Female            1,530,988           564,654
Total names       3,031,020           1,147,651
Total words       10,272,108          4,015,734
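
The counts in Table 2 can be illustrated with a minimal sketch of how per-gender name counts, total names, and total words might be computed before and after duplicate filtering. The records, the function names `summarize` and `drop_duplicates`, and the choice to treat a duplicate as a repeated (name, gender) pair are all assumptions made for illustration; the source does not specify the exact filtering rule, and the sample below is not taken from GenderVN1.0.

```python
from collections import Counter

def summarize(records):
    """Count names per gender, total names, and total whitespace-separated words."""
    by_gender = Counter(gender for _, gender in records)
    total_words = sum(len(name.split()) for name, _ in records)
    return {
        "male": by_gender["male"],
        "female": by_gender["female"],
        "names": len(records),
        "words": total_words,
    }

def drop_duplicates(records):
    """Keep the first occurrence of each (name, gender) pair, preserving order.

    Assumption: a duplicate is an exact repeat of the same name with the same label.
    """
    return list(dict.fromkeys(records))

# Hypothetical toy sample, not drawn from the actual dataset.
sample = [
    ("Nguyễn Văn An", "male"),
    ("Trần Thị Hoa", "female"),
    ("Nguyễn Văn An", "male"),   # exact duplicate, removed by the filter
    ("Lê Minh Anh", "female"),
]

original = summarize(sample)
filtered = summarize(drop_duplicates(sample))
print(original)
print(filtered)
```

Applied to the toy sample, the filter removes one repeated male name, so both the name count and the word count drop, mirroring the direction of the change between the two columns of the table.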