Research Article

Improving Transformer-Based Neural Machine Translation with Prior Alignments

Algorithm 2

Procedure to construct statistical alignments.
(1) We tokenize both Vietnamese source sentences and English target sentences into words
(2) We replace English words with their lemmas
(3) We construct many-to-one alignments from Vietnamese words to English lemmas, using the fast_align token aligner
(4) We repeat step 3 in the reverse direction, from English lemmas to Vietnamese words
(5) We merge the bidirectional alignments generated in steps 3 and 4, following the grow-diagonal heuristics proposed by Koehn et al. [24]
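
The merging step can be illustrated with a minimal sketch of a grow-diagonal style symmetrization. It is not the implementation used in the paper, and it rests on several assumptions: both alignments are given in the "i-j" (Pharaoh) text format that fast_align emits, both are expressed as (source index, target index) pairs, and the function names `parse_pharaoh`, `to_pharaoh`, and `grow_diag` are hypothetical helpers introduced here. The sketch starts from the intersection of the two alignments and repeatedly adds neighbouring links from their union whenever the source word or the target word of the candidate link is still unaligned; the optional "final" steps of related heuristics are omitted.

```python
from typing import List, Set, Tuple

Link = Tuple[int, int]  # (source_index, target_index); assumed orientation

# 8-connected neighbourhood: horizontal, vertical, and diagonal neighbours.
NEIGHBORS: List[Link] = [
    (-1, 0), (0, -1), (1, 0), (0, 1),
    (-1, -1), (-1, 1), (1, -1), (1, 1),
]


def parse_pharaoh(line: str) -> Set[Link]:
    """Parse one line of 'i-j' pairs into a set of links."""
    links = set()
    for pair in line.split():
        i, j = pair.split("-")
        links.add((int(i), int(j)))
    return links


def to_pharaoh(links: Set[Link]) -> str:
    """Serialize links back to 'i-j' pairs, sorted for readability."""
    return " ".join(f"{i}-{j}" for i, j in sorted(links))


def grow_diag(forward: Set[Link], reverse: Set[Link]) -> Set[Link]:
    """Merge two directional alignments (hypothetical helper).

    Start from the intersection, then grow it with union links that are
    adjacent to an accepted link and cover a still-unaligned word.
    """
    union = forward | reverse
    alignment = set(forward & reverse)
    src_aligned = {i for i, _ in alignment}
    tgt_aligned = {j for _, j in alignment}

    added = True
    while added:  # repeat until a full pass adds no new link
        added = False
        for i, j in sorted(alignment):  # snapshot, so the set can grow
            for di, dj in NEIGHBORS:
                cand = (i + di, j + dj)
                if cand in alignment or cand not in union:
                    continue
                # Accept only if one side of the link is still unaligned.
                if cand[0] not in src_aligned or cand[1] not in tgt_aligned:
                    alignment.add(cand)
                    src_aligned.add(cand[0])
                    tgt_aligned.add(cand[1])
                    added = True
    return alignment


# Toy example with made-up links (not data from the paper).
forward = parse_pharaoh("0-0 1-1 2-1 4-0")
reverse = parse_pharaoh("0-0 1-1 2-2 3-2")
merged = grow_diag(forward, reverse)
print(to_pharaoh(merged))  # prints "0-0 1-1 2-1 2-2 3-2"
```

In this toy example the link 4-0 appears in only one direction and has no accepted neighbour, so it is left out of the merged alignment, whereas 2-1, 2-2, and 3-2 are reached by growing outward from the intersection points 0-0 and 1-1.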