Research Article

Performance Analysis of Homogeneous On-Chip Large-Scale Parallel Computing Architectures for Data-Parallel Applications

Figure 2

The subprogram running on a processor node is abstracted as a set of subtasks and communications.
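The abstraction in this caption can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, the fields (`cycles`, `dest_node`, `size_bytes`), and the two aggregate helpers are hypothetical illustrations and are not taken from the article, which may characterize subtasks and communications differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subtask:
    """A unit of local computation (hypothetical fields)."""
    name: str
    cycles: int  # assumed cost measure: compute cycles


@dataclass
class Communication:
    """A message sent to another processor node (hypothetical fields)."""
    dest_node: int
    size_bytes: int


@dataclass
class Subprogram:
    """Work assigned to one processor node: subtasks plus communications."""
    node_id: int
    subtasks: List[Subtask] = field(default_factory=list)
    communications: List[Communication] = field(default_factory=list)

    def total_cycles(self) -> int:
        # Sum of the assumed compute cost over all subtasks.
        return sum(s.cycles for s in self.subtasks)

    def total_bytes(self) -> int:
        # Sum of the assumed message sizes over all communications.
        return sum(c.size_bytes for c in self.communications)


# Illustrative values only; they are not measurements from the article.
prog = Subprogram(
    node_id=0,
    subtasks=[Subtask("load", 100), Subtask("compute", 400)],
    communications=[Communication(dest_node=1, size_bytes=64)],
)
print(prog.total_cycles(), prog.total_bytes())
```

Keeping computation and communication in separate lists mirrors the caption's wording and lets each be totaled independently; how the two combine into an execution time is left open here, since the caption does not say.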