Research Article
Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU
Algorithm 4
Main procedure of generating row blocks.
Input: , , , ;
Output: , ;
; ; ; ;
for to do
    // Compute non-zeros and the total rows
    += ;
    ++;
    if || ( && ) then
        // This row fills up SHARED_SIZE or threads per block
        ; ++; ; ;
    else if then
        // This row is an extra one that is excluded
        ; ++; ; ; --;
    end
done
// Extra case
if != then
    ;
else
    --;
end
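The listing above lost its symbols, so the following is only a minimal C++ sketch of the kind of row-block generation that the surviving comments describe. It is not a reconstruction of the authors' exact procedure. All names (`build_row_blocks`, `row_ptr`, `shared_size`, `threads_per_block`) are hypothetical, and the two branch conditions are inferred from the comments "fills up SHARED_SIZE or threads per block" and "an extra one that is excluded". Letting a single row that exceeds `shared_size` form a block of its own is a further assumption, added so that the scan always terminates.

```cpp
#include <cassert>
#include <vector>

// Sketch only: identifiers and exact conditions are assumptions, not the paper's.
// row_ptr is the CSR row-pointer array (size = number of rows + 1).
// Returns block boundaries: block k covers rows [blocks[k], blocks[k + 1]).
std::vector<int> build_row_blocks(const std::vector<int>& row_ptr,
                                  int shared_size, int threads_per_block) {
    assert(shared_size > 0 && threads_per_block > 0);
    if (row_ptr.empty()) return {};

    const int num_rows = static_cast<int>(row_ptr.size()) - 1;
    std::vector<int> blocks{0};  // first block starts at row 0
    int nnz = 0;                 // non-zeros accumulated in the current block
    int rows = 0;                // rows accumulated in the current block

    for (int i = 0; i < num_rows; ++i) {
        // Compute non-zeros and the total rows
        nnz += row_ptr[i + 1] - row_ptr[i];
        ++rows;

        if (nnz == shared_size ||
            (rows == threads_per_block && nnz <= shared_size)) {
            // This row fills up shared_size or threads per block
            blocks.push_back(i + 1);
            nnz = 0;
            rows = 0;
        } else if (nnz > shared_size) {
            if (rows > 1) {
                // This row is an extra one: close the block before it
                // and revisit it as the first row of the next block
                blocks.push_back(i);
                --i;
            } else {
                // Assumption: a single over-long row becomes its own block
                blocks.push_back(i + 1);
            }
            nnz = 0;
            rows = 0;
        }
    }

    // Extra case: trailing rows that did not fill a block
    if (blocks.back() != num_rows) {
        blocks.push_back(num_rows);
    }
    return blocks;
}
```

For example, with `row_ptr = {0, 2, 4, 9, 10, 10, 13}`, `shared_size = 4`, and `threads_per_block = 2`, the sketch returns the boundaries `{0, 2, 3, 5, 6}`: rows 0 and 1 fill the non-zero budget exactly, row 2 is too long and stands alone, rows 3 and 4 reach the thread limit, and row 5 is picked up by the final extra case.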