Research Article
Cross-Checking Multiple Data Sources Using Multiway Join in MapReduce
Table 1
Fat relations processing times (in milliseconds) on 8 compute nodes.
| | Join 1: 2 out of 3 overlap | Join 1: 2 out of 3 overlap | Join 1: 2 out of 3 overlap | Join 3: 3 out of 4 overlap | Join 3: 3 out of 4 overlap | Join 3: 3 out of 4 overlap | Join 4: 3 out of 4 overlap |
| --- | --- | --- | --- | --- | --- | --- | --- |
| # of records \ # of reducers | 27 | 64 | 125 | 9 | 16 | 25 | 81 |
| 24,000 | 490,840 | 351,080 | 446,020 | 38,230 | 44,970 | 61,800 | 51,780 |
| 28,000 | 633,200 | 466,330 | 470,890 | 61,010 | 61,200 | 67,680 | 56,420 |
| 32,000 | 1,068,060 | 767,200 | 694,910 | 99,940 | 95,430 | 95,280 | 60,810 |
| 36,000 | 1,687,300 | 1,174,030 | 1,026,560 | 159,570 | 145,420 | 145,860 | 66,490 |
| 40,000 | 2,620,760 | 1,852,560 | 1,484,280 | 268,030 | 245,040 | 291,930 | 74,070 |