Research Article
Improving Transformer-Based Neural Machine Translation with Prior Alignments
Algorithm 2
Procedure to construct statistical alignments
(1) We tokenize both the Vietnamese source sentences and the English target sentences into words.
(2) We replace English words with their lemmas.
(3) We construct many-to-one alignments from Vietnamese words to English lemmas, using the fast_align token aligner.
(4) We repeat step 3 in the reverse direction, from English lemmas to Vietnamese words.
(5) We merge the bidirectional alignments generated in steps 3 and 4, following the grow-diagonal heuristic proposed by Koehn et al. [24].
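The merging step of Algorithm 2 can be illustrated with a minimal Python sketch of the basic grow-diagonal heuristic. It is not the authors' implementation. It assumes that both directional alignments are given as space-separated `i-j` index pairs (the Pharaoh format) and that both are already expressed as (source index, target index); the function names `parse_pharaoh` and `grow_diag` are hypothetical. Variants of the heuristic that add a final step are not covered here.

```python
# Illustrative sketch of grow-diagonal symmetrization (assumed input format:
# Pharaoh-style "i-j" pairs, both directions oriented as source-target).

# The eight neighbours of an alignment point, including diagonals.
NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]


def parse_pharaoh(line):
    """Parse a line such as '0-0 1-1 2-1' into a set of (src, tgt) links."""
    links = set()
    for pair in line.split():
        i, j = pair.split("-")
        links.add((int(i), int(j)))
    return links


def grow_diag(forward, backward):
    """Start from the intersection and grow towards the union.

    A neighbouring link from the union is added when its source word or
    its target word is not yet aligned.
    """
    union = forward | backward
    alignment = set(forward & backward)
    added = True
    while added:
        added = False
        # sorted() takes a snapshot, so adding links inside the loop is safe.
        for i, j in sorted(alignment):
            for di, dj in NEIGHBORS:
                cand = (i + di, j + dj)
                if cand in alignment or cand not in union:
                    continue
                src_aligned = any(a == cand[0] for a, _ in alignment)
                tgt_aligned = any(b == cand[1] for _, b in alignment)
                if not src_aligned or not tgt_aligned:
                    alignment.add(cand)
                    added = True
    return alignment


# Toy example: the two directions agree on 0-0 and 1-1 only.
forward = parse_pharaoh("0-0 1-1 2-1")   # source-to-target direction
backward = parse_pharaoh("0-0 1-1 0-3")  # reverse direction, same orientation
merged = grow_diag(forward, backward)
print(sorted(merged))  # [(0, 0), (1, 1), (2, 1)]
```

In this toy example the link 2-1 is added because it neighbours the agreed link 1-1 and source word 2 is otherwise unaligned, whereas 0-3 is dropped because it does not neighbour any accepted link.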