Research Article
Content Deduplication with Granularity Tweak Based on Base and Deviation for Large Text Dataset
Table 2
Document vector space.
| Document | Topic_1 | Topic_2 | Topic_3 |
| --- | --- | --- | --- |
| D1 | 0.40 | 0.00 | 0.00 |
| D2 | 0.97 | 0.00 | 0.00 |
| D3 | 0.00 | 0.81 | 0.00 |
| D4 | 0.00 | 0.78 | 0.00 |
| D5 | 0.00 | 0.00 | 0.82 |
| D6 | 0.00 | 0.00 | 0.82 |
| D7 | 0.89 | 0.00 | 0.00 |
| D8 | 0.00 | 0.63 | 0.00 |
| D9 | −4.46139E−17 | 7.64073E−16 | 4.36267E−16 |
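As a reading aid for Table 2, the following minimal sketch groups the documents by their largest topic weight. The vectors are copied from the table. The helper names `dominant_topic` and `group_by_topic` and the threshold `EPSILON` are hypothetical, introduced only for illustration, and the grouping rule is an assumption rather than the article's stated procedure. The sketch treats the D9 entries, which are on the order of 1E−16, as floating-point noise around zero.

```python
# Document-topic weights copied from Table 2.
doc_vectors = {
    "D1": (0.40, 0.00, 0.00),
    "D2": (0.97, 0.00, 0.00),
    "D3": (0.00, 0.81, 0.00),
    "D4": (0.00, 0.78, 0.00),
    "D5": (0.00, 0.00, 0.82),
    "D6": (0.00, 0.00, 0.82),
    "D7": (0.89, 0.00, 0.00),
    "D8": (0.00, 0.63, 0.00),
    "D9": (-4.46139e-17, 7.64073e-16, 4.36267e-16),
}

# Hypothetical cutoff: weights smaller than this in magnitude count as zero.
EPSILON = 1e-9


def dominant_topic(vector, eps=EPSILON):
    """Return the 1-based index of the largest weight, or None if all are ~0."""
    cleaned = [0.0 if abs(w) < eps else w for w in vector]
    if not any(cleaned):
        return None
    return max(range(len(cleaned)), key=lambda i: cleaned[i]) + 1


def group_by_topic(vectors):
    """Map each dominant topic index (or None) to the list of document ids."""
    groups = {}
    for doc_id, vec in vectors.items():
        groups.setdefault(dominant_topic(vec), []).append(doc_id)
    return groups


groups = group_by_topic(doc_vectors)
```

Under these assumptions, D1, D2, and D7 fall under Topic_1, D3, D4, and D8 under Topic_2, D5 and D6 under Topic_3, and D9 is assigned to no topic.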