| Mapper |
| Step 1: Take as input a data set D of m points |
| Step 2: Create the distance vector V by computing the Euclidean distance from every data point (zi) to all data points (using equation (3)) |
| Step 3: Compute the average distance (R) using equation (4) |
| Step 4: Set neighbor_count to zero |
| Step 5: For every distance d in V |
| If (d < R) |
| { |
| neighbor_count = neighbor_count + 1; |
| } |
| Step 6: end if |
| Step 7: end for |
| Step 8: Determine the threshold value (T) |
| Step 9: Classify the data point by density: if its neighbor count reaches the threshold, place it in the high-density set; otherwise, place it in the low-density set. |
| If (neighbor_count ≥ T) |
| { |
| Use the index of the distance value as the key and save it in a <key, value> structure such as HD_set (1, key) |
| } |
| Else |
| { |
| Save the key in the low-density set, LD_set (2, key) |
| } |
| Step 10: end if |
| Reducer |
| Step 1: Gather the mapper function's results, e.g., HD_set (1, list<values>) |
| Step 2: Choose K initial centers from (1, list<values>) |
| Step 3: Initialize S[k] using these points |
| Step 4: Set min_distance to the maximum value |
| Step 5: For i = 0 to S.length - 1 |
| Distance_estimate = distance between data point d and center S[i] |
| If (Distance_estimate < min_distance) |
| { |
| min_distance = Distance_estimate; index = i; |
| } |
| Step 6: end for |
| Step 7: Use the index as KEY and the matching data values as VAL |
| Step 8: Output the clusters as (KEY, list<VAL>) |
| Step 9: End |
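
The mapper and reducer steps above can be sketched in plain Python as follows. This is an illustrative, single-machine sketch rather than the authors' implementation, and it rests on several assumptions: equations (3) and (4) are not reproduced here, so the sketch takes R to be the mean Euclidean distance over all distinct pairs of points; the rule for determining T is not given, so `threshold` is passed in by the caller; the K initial centers are taken as evenly spaced entries of the high-density list; and the reducer assigns every data point (not only the high-density ones) to its nearest center. The function names `mapper` and `reducer` and their parameters are hypothetical.

```python
import math


def mapper(points, threshold):
    """Split point indices into a high-density set (key 1) and a low-density set (key 2)."""
    m = len(points)
    # Distance vector for every point: Euclidean distance to each data point.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumption: R is the mean distance over all distinct pairs of points.
    pairs = m * (m - 1) // 2
    total = sum(dist[i][j] for i in range(m) for j in range(i + 1, m))
    R = total / pairs if pairs else 0.0

    sets = {1: [], 2: []}
    for i in range(m):
        # Count the neighbors closer than R, skipping the point itself.
        neighbor_count = sum(1 for j in range(m) if j != i and dist[i][j] < R)
        sets[1 if neighbor_count >= threshold else 2].append(i)
    return sets


def reducer(points, hd_indices, k):
    """Assign every point index to the nearest of k centers drawn from the high-density set."""
    if not 1 <= k <= len(hd_indices):
        raise ValueError("k must be between 1 and the number of high-density points")

    # Assumption: initial centers are evenly spaced entries of the high-density list.
    step = len(hd_indices) // k
    centers = [points[hd_indices[i * step]] for i in range(k)]

    clusters = {}
    for idx, p in enumerate(points):
        min_distance = math.inf
        index = -1
        for i, center in enumerate(centers):
            distance_estimate = math.dist(p, center)
            if distance_estimate < min_distance:
                min_distance, index = distance_estimate, i
        # KEY is the index of the nearest center, VAL is the point index.
        clusters.setdefault(index, []).append(idx)
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    density_sets = mapper(data, threshold=2)
    print(density_sets)
    print(reducer(data, density_sets[1], k=2))
```

On this toy data set, the isolated point (50, 50) has no neighbor within R and lands in the low-density set, so it cannot be picked as an initial center. Keeping outliers out of the initial centers is the purpose of the density split performed in the mapper.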