Research Article

Optimizing Hadoop Performance for Big Data Analytics in Smart Grid

Table 6

GEP recommended configuration parameter settings.

Optimized values, by number of data samples (millions)

| Configuration parameter | 17.28 | 34.56 | 51.84 | 69.12 | 86.40 |
|---|---|---|---|---|---|
| io.sort.factor | 10 | 10 | 13 | 19 | 12 |
| io.sort.mb | 38 | 13 | 58 | 62 | 55 |
| io.sort.spill.percent | 0.90 | 0.90 | 0.90 | 0.89 | 0.90 |
| mapred.reduce.tasks | 141413216 | | | | |
| mapreduce.tasktracker.map.tasks.maximum | 8 | 6 | 8 | 8 | 8 |
| mapreduce.tasktracker.reduce.tasks.maximum | 1 | 1 | 1 | 2 | 1 |
| mapred.child.java.opts | 169 | 135 | 121 | 124 | 121 |
| mapreduce.reduce.shuffle.input.buffer.percent | 0.73 | 0.65 | 0.65 | 0.65 | 0.75 |
| mapred.inmem.merge.threshold | 200 | 200 | 201 | 202 | 201 |
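
Settings such as those in Table 6 are usually supplied to Hadoop as `<property>` entries inside a `<configuration>` element. The following is a minimal sketch, not the authors' tooling, that renders a subset of the 17.28 million sample column in that form. The names `SETTINGS_17_28M`, `build_configuration`, and `to_xml` are hypothetical, and the choice of target file (for example `mapred-site.xml`) is an assumption about the deployment.

```python
import xml.etree.ElementTree as ET

# Subset of Table 6, column for 17.28 million data samples.
# Only some parameters are shown; the selection is illustrative.
SETTINGS_17_28M = {
    "io.sort.spill.percent": "0.90",
    "mapreduce.tasktracker.map.tasks.maximum": "8",
    "mapreduce.tasktracker.reduce.tasks.maximum": "1",
    "mapreduce.reduce.shuffle.input.buffer.percent": "0.73",
    "mapred.inmem.merge.threshold": "200",
}


def build_configuration(settings):
    """Build a Hadoop-style <configuration> element from name/value pairs."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return root


def to_xml(settings):
    """Serialize the settings as an XML string."""
    return ET.tostring(build_configuration(settings), encoding="unicode")


if __name__ == "__main__":
    print(to_xml(SETTINGS_17_28M))
```

Keeping one dictionary per data size would make it straightforward to regenerate the configuration when the input volume changes, since the recommended values differ from column to column.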