Research Article

Cross-Checking Multiple Data Sources Using Multiway Join in MapReduce

Table 1

Fat relations processing times (in milliseconds) on 8 compute nodes.

                   Join 1: 2 out of 3 overlap    Join 3: 3 out of 4 overlap    Join 4: 3 out of 4 overlap
# of reducers          27        64       125         9        16        25        81
# of records
24,000             490840    351080    446020     38230     44970     61800     51780
28,000             633200    466330    470890     61010     61200     67680     56420
32,000            1068060    767200    694910     99940     95430     95280     60810
36,000            1687300   1174030   1026560    159570    145420    145860     66490
40,000            2620760   1852560   1484280    268030    245040    291930     74070