Research Article

Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU

Algorithm 5

Multiple scalar-style reduction.

    …;
    …;
    // Perform a multiple scalar-style reduction from temp_
    …;
    …; … & (…);
    if … then
        // Perform a partial reduction from temp_
        …;
        …;
        …;
        for … to … with … += … do
            … += …;
        done
        …;
        …();
        // Perform a warp reduction from bVAL_s
        if … && …
            … += …;
        …();
        if … && … >= 16
            … += …; …();
        if … && … >= 8
            … += …; …();
        if … && …
            … += …; …();
        if … && …
            …;
    end
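
The two stages named in the listing's comments (a strided partial reduction followed by a warp-style reduction with halving strides) can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the authors' kernel: the function name `partial_then_tree_reduce`, the parameter `group_size`, and the list `partial` (standing in for a shared buffer such as `bVAL_s`) are hypothetical, and the sequential loops stand in for threads that would run concurrently with synchronization between passes on a GPU.

```python
def partial_then_tree_reduce(temp, group_size):
    """Sum `temp` the way a group of cooperating threads might (simulated on the CPU).

    Assumption: `group_size` is a positive power of two (e.g. up to 32 lanes).
    """
    if group_size <= 0 or group_size & (group_size - 1):
        raise ValueError("group_size must be a positive power of two")

    # Stage 1: partial reduction. Simulated lane `lane` accumulates
    # temp[lane], temp[lane + group_size], temp[lane + 2 * group_size], ...
    partial = [0.0] * group_size
    for lane in range(group_size):
        for i in range(lane, len(temp), group_size):
            partial[lane] += temp[i]

    # Stage 2: tree reduction. Each pass halves the number of active lanes;
    # on a GPU a synchronization point would separate the passes.
    stride = group_size // 2
    while stride >= 1:
        for lane in range(stride):
            partial[lane] += partial[lane + stride]
        stride //= 2

    return partial[0]


if __name__ == "__main__":
    # Ten products reduced by a simulated group of four lanes.
    print(partial_then_tree_reduce([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4))  # prints 55.0
```

With `group_size = 32` the second stage performs passes with strides 16, 8, 4, 2, and 1, which mirrors the sequence of guarded `+=` steps in the listing; smaller groups simply skip the larger strides.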