Research Article
A Heterogeneous Parallel LU Factorization Algorithm Based on a Basic Column Block Uniform Allocation Strategy
Algorithm 1
Heterogeneous parallel LU factorization algorithm based on a basic column block uniform allocation strategy for a multiple CPU/GPU system.
for k = 1, ..., n do
    if (process_id = z) then
        dgetrf() or gpu_dgetrf()
    end if
    broadcast()
    if (process_id = z) then
        dlaswp(left(), right(), ) or cublasDswap(left(), right(), )
    else
        dlaswp() or cublasDswap()
    end if
    dtrsm(, right()) or cublasDtrsm(, right())
    dgemm(, right(), rest()) or cublasDgemm(, right(), rest())
end for
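The sequence of operations in Algorithm 1 (panel factorization, row interchange, triangular solve, trailing update) can be illustrated with a minimal single-process sketch in plain Python. This is not the authors' implementation: the owner test (process_id = z), the broadcast, and the CPU/GPU dispatch are collapsed into one process, and the function name blocked_lu, the block width nb, and the permutation vector perm are hypothetical names introduced only for this illustration. Comments mark which BLAS/LAPACK routine each loop nest stands in for.

```python
def blocked_lu(a, nb):
    """Right-looking blocked LU with partial pivoting (illustrative sketch).

    a  : square matrix as a list of lists of floats (not modified)
    nb : column block width (hypothetical parameter)
    Returns (lu, perm): lu holds the unit-lower factor L below the diagonal
    and U on and above it; perm[i] is the original index of the row now at
    position i, so that row perm[i] of a equals row i of L*U.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # Panel factorization (role of dgetrf): unblocked LU on columns k0..k1-1.
        swaps = []
        for j in range(k0, k1):
            p = max(range(j, n), key=lambda i: abs(lu[i][j]))
            if lu[p][j] == 0.0:
                raise ValueError("matrix is singular")
            swaps.append((j, p))
            if p != j:
                # Swap inside the panel only; other columns are swapped later.
                for c in range(k0, k1):
                    lu[j][c], lu[p][c] = lu[p][c], lu[j][c]
            for i in range(j + 1, n):
                lu[i][j] /= lu[j][j]
                for c in range(j + 1, k1):
                    lu[i][c] -= lu[i][j] * lu[j][c]
        # Row interchanges on the left and right blocks (role of dlaswp).
        outside = list(range(0, k0)) + list(range(k1, n))
        for j, p in swaps:
            if p != j:
                perm[j], perm[p] = perm[p], perm[j]
                for c in outside:
                    lu[j][c], lu[p][c] = lu[p][c], lu[j][c]
        # Triangular solve on the right block (role of dtrsm): U12 = L11^{-1} A12.
        for c in range(k1, n):
            for i in range(k0, k1):
                for t in range(k0, i):
                    lu[i][c] -= lu[i][t] * lu[t][c]
        # Trailing update on the rest of the matrix (role of dgemm): A22 -= L21 U12.
        for i in range(k1, n):
            for c in range(k1, n):
                s = 0.0
                for t in range(k0, k1):
                    s += lu[i][t] * lu[t][c]
                lu[i][c] -= s
    return lu, perm
```

A call such as lu, perm = blocked_lu(a, 2) factors a 4-by-4 matrix in two column blocks. In a distributed setting, the panel loop would run only on the process that owns the current column block, and the factored panel together with its pivot list would be broadcast before the remaining three steps run on every process.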