Research Article

Parallel Implementation of FEM Solver for Shared Memory Using OpenMP

Algorithm 2

Parallel Conjugate Gradient algorithm.
Input: (A, b, x, N, rtr1, rtr2, ptAp)
Output: x
(1) fork (A, b, x, N, rtr1, rtr2, ptAp) {shared variables}
(2) Th_id, Tt_th {thread id and total number of threads, given by the OS}
(3) St_RowID ← Ed_RowID ← 0 {start and end row ids}
(4) if Th_id < N % Tt_th then
(5)   St_RowID ← Th_id ∗ ⌊N/Tt_th⌋ + Th_id
(6)   Ed_RowID ← St_RowID + ⌊N/Tt_th⌋ + 1
(7) else
(8)   St_RowID ← Th_id ∗ ⌊N/Tt_th⌋ + N % Tt_th
(9)   Ed_RowID ← St_RowID + ⌊N/Tt_th⌋
(10) end if
(11) rtr1 ← rtr2 ← t_rtr ← 0
(12) foreach i ∈ [St_RowID, Ed_RowID) do
(13)   p[i] ← r[i] ← b[i] - (Σ_j A[i][j] ∗ x[j])
(14)   t_rtr ← t_rtr + r[i] ∗ r[i]
(15) end for
(16) lock rtr2 ← rtr2 + t_rtr unlock
(17) barrier
(18) while rtr2 > Threshold do
(19)   rtr1 ← rtr2
(20)   rtr2 ← t_ptAp ← 0
(21)   foreach i ∈ [St_RowID, Ed_RowID) do
(22)     t_ptAp ← t_ptAp + p[i] ∗ (Σ_j A[i][j] ∗ p[j])
(23)   end for
(24)   lock ptAp ← ptAp + t_ptAp unlock
(25)   barrier
(26)   t_rtr ← 0
(27)   foreach i ∈ [St_RowID, Ed_RowID) do
(28)     x[i] ← x[i] + (rtr1/ptAp) ∗ p[i]
(29)     r[i] ← r[i] - (rtr1/ptAp) ∗ (Σ_j A[i][j] ∗ p[j])
(30)     t_rtr ← t_rtr + r[i] ∗ r[i]
(31)   end for
(32)   ptAp ← 0
(33)   lock rtr2 ← rtr2 + t_rtr unlock
(34)   barrier
(35)   foreach i ∈ [St_RowID, Ed_RowID) do
(36)     p[i] ← r[i] + (rtr2/rtr1) ∗ p[i]
(37)   end for
(38) end while
(39) join
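
The row partitioning and the lock/barrier pattern of Algorithm 2 can be illustrated with a minimal sketch in C with OpenMP. It is not the authors' implementation. It assumes a dense, row-major, symmetric positive definite matrix, and the names cg_solve, row_range, row_dot, ap, and max_iter are hypothetical, introduced only for this example. Two deliberate differences from the listing: the row product A[i]·p is stored in a scratch array instead of being recomputed, and the shared scalars are reset by one thread inside a single block with extra barriers, so that no thread resets rtr2 or ptAp while another is still reading it.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP support. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Half-open row range [*st, *ed) owned by thread tid out of nthreads. */
void row_range(int n, int tid, int nthreads, int *st, int *ed)
{
    int base = n / nthreads, rem = n % nthreads;
    if (tid < rem) {              /* the first rem threads take one extra row */
        *st = tid * base + tid;
        *ed = *st + base + 1;
    } else {
        *st = tid * base + rem;
        *ed = *st + base;
    }
}

/* Dot product of row i of the dense row-major n x n matrix a with v. */
double row_dot(const double *a, const double *v, int n, int i)
{
    double s = 0.0;
    for (int j = 0; j < n; ++j)
        s += a[(size_t)i * n + j] * v[j];
    return s;
}

/* Conjugate gradient for symmetric positive definite A. x holds the initial
   guess on entry and the approximate solution on exit. Iterates while the
   squared residual norm exceeds threshold; returns the iteration count. */
int cg_solve(const double *a, const double *b, double *x, int n,
             double threshold, int max_iter)
{
    double *r = malloc((size_t)n * sizeof *r);
    double *p = malloc((size_t)n * sizeof *p);
    double *ap = malloc((size_t)n * sizeof *ap);  /* scratch for A*p */
    double rtr1 = 0.0, rtr2 = 0.0, ptap = 0.0;    /* shared accumulators */
    int iters = 0;
    assert(n > 0 && r && p && ap);

    #pragma omp parallel
    {
        int st, ed;
        double t_rtr = 0.0;
        row_range(n, omp_get_thread_num(), omp_get_num_threads(), &st, &ed);

        for (int i = st; i < ed; ++i) {           /* steps 12-15 */
            p[i] = r[i] = b[i] - row_dot(a, x, n, i);
            t_rtr += r[i] * r[i];
        }
        #pragma omp critical
        rtr2 += t_rtr;                            /* step 16 */
        #pragma omp barrier

        for (;;) {
            /* Every thread reads the same stable values, so all agree. */
            int done = !(rtr2 > threshold && iters < max_iter);
            #pragma omp barrier
            if (done)
                break;
            #pragma omp single
            { rtr1 = rtr2; rtr2 = 0.0; ptap = 0.0; ++iters; }
            /* implicit barrier at the end of single */

            double t_ptap = 0.0;
            for (int i = st; i < ed; ++i) {       /* steps 21-23 */
                ap[i] = row_dot(a, p, n, i);
                t_ptap += p[i] * ap[i];
            }
            #pragma omp critical
            ptap += t_ptap;                       /* step 24 */
            #pragma omp barrier

            double alpha = rtr1 / ptap;
            t_rtr = 0.0;
            for (int i = st; i < ed; ++i) {       /* steps 27-31 */
                x[i] += alpha * p[i];
                r[i] -= alpha * ap[i];
                t_rtr += r[i] * r[i];
            }
            #pragma omp critical
            rtr2 += t_rtr;                        /* step 33 */
            #pragma omp barrier

            double beta = rtr2 / rtr1;
            for (int i = st; i < ed; ++i)         /* steps 35-37 */
                p[i] = r[i] + beta * p[i];
            #pragma omp barrier                   /* p complete before reuse */
        }
    }
    free(r); free(p); free(ap);
    return iters;
}
```

Built with OpenMP enabled (for example, GCC's -fopenmp flag), each thread works on its own row block; built without it, the pragmas are ignored and the same code runs serially on the full row range. In practice the two critical sections could likely be replaced by OpenMP reduction clauses, but the explicit form is kept here because it mirrors the lock and barrier steps of the listing.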