Research Article
Parallel Cleaning Algorithm for Similar Duplicate Chinese Data Based on BERT
Table 2
Configuration parameters of cluster nodes.
| Host name | IP address | Chip model | Number of cores | Running memory (GB) | Hard disk size (GB) |
| --- | --- | --- | --- | --- | --- |
| Master | 192.168.2.101 | Intel® Core™ i7-6700 CPU @3.40 GHz | 8 | 8 | 1,000 |
| slave1 | 192.168.2.103 | Intel® Xeon® CPU E5-1603 v3 @2.80 GHz | 4 | 16 | 2,000 |
| slave2 | 192.168.2.102 | Intel® Core™ i3-2120 CPU @3.30 GHz | 4 | 8 | 500 |
| slave3 | 192.168.2.104 | Intel® Core™ i5-4590 CPU @3.30 GHz | 4 | 8 | 500 |