Research Article
A Fast Fully Parallel Ant Colony Optimization Algorithm Based on CUDA for Solving TSP
Figure 3
Architecture of an SM. The SM is the unit that executes the instructions that make up a kernel function. It consists of an instruction cache for storing instructions, several tensor cores that dispatch and execute instructions under the control of their own controllers, the texture memory, and the shared memory. Each tensor core has its own controller (comprising the warp scheduler and the dispatch unit), registers, and many CUDA cores.
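To make the role of the warp scheduler more concrete, the following minimal Python sketch models how the threads of one block are grouped into warps of 32 threads and how those warps might be spread over the schedulers of an SM. The warp size of 32 is standard on NVIDIA GPUs. The function names, the default of four schedulers per SM, and the round-robin assignment are assumptions made for illustration only; they are not taken from the article, and the real hardware policy may differ.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def warps_per_block(threads_per_block: int) -> int:
    """Number of warps needed to hold one thread block.

    The last warp may be only partially filled.
    """
    if threads_per_block <= 0:
        raise ValueError("threads_per_block must be positive")
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE


def assign_warps(threads_per_block: int, num_schedulers: int = 4) -> list:
    """Distribute warp indices over schedulers in round-robin order.

    Both num_schedulers=4 and the round-robin policy are illustrative
    assumptions, not a description of the actual hardware.
    """
    groups = [[] for _ in range(num_schedulers)]
    for warp in range(warps_per_block(threads_per_block)):
        groups[warp % num_schedulers].append(warp)
    return groups


if __name__ == "__main__":
    # A block of 256 threads forms 8 full warps.
    print(warps_per_block(256))
    print(assign_warps(256))
```

Under these assumptions, a block whose size is not a multiple of 32 still occupies a whole number of warps, so some CUDA cores sit idle while the final, partially filled warp executes.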