Research Article
Improving Transformer-Based Neural Machine Translation with Prior Alignments
Algorithm 2
Procedure to construct statistical alignments
(1) We tokenize both Vietnamese source sentences and English target sentences into words.
(2) We replace English words with their lemmas.
(3) We construct many-to-one alignments from Vietnamese words to English lemmas, using the fast_align token aligner.
(4) We repeat step 3 in the reverse direction, from English lemmas to Vietnamese words.
(5) We merge the bidirectional alignments generated in steps 3 and 4, following the grow-diagonal heuristics proposed by Koehn et al. [24].
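The merge in step 5 can be illustrated with a minimal sketch of the grow-diagonal idea. It makes several assumptions that the procedure above does not state: both directional alignments are given in the Pharaoh "i-j" text format, both are already indexed as (Vietnamese position, English position), and only the grow-diag expansion is applied, with no "final" step. The names `parse_pharaoh` and `grow_diag` are hypothetical helpers for illustration and are not part of fast_align.

```python
# Neighbouring offsets: horizontal, vertical, and diagonal.
NEIGHBORS = [(-1, 0), (0, -1), (1, 0), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]


def parse_pharaoh(line):
    """Parse a line such as '0-0 1-1 2-1' into a set of (src, tgt) index pairs."""
    return {tuple(int(x) for x in pair.split("-")) for pair in line.split()}


def grow_diag(forward, reverse):
    """Merge two directional alignments (assumed to use the same (src, tgt) indexing).

    Start from the intersection, then repeatedly add points from the union
    that neighbour an accepted point and cover a still-unaligned word.
    """
    union = forward | reverse
    alignment = set(forward & reverse)
    added = True
    while added:
        added = False
        # Iterate over a snapshot so the set can grow during the pass.
        for i, j in sorted(alignment):
            for di, dj in NEIGHBORS:
                cand = (i + di, j + dj)
                if cand in alignment or cand not in union:
                    continue
                src_aligned = any(a == cand[0] for a, _ in alignment)
                tgt_aligned = any(b == cand[1] for _, b in alignment)
                # Accept the point if it links at least one unaligned word.
                if not src_aligned or not tgt_aligned:
                    alignment.add(cand)
                    added = True
    return alignment


# Toy example: the two directions disagree on how source word 2 is aligned.
forward = parse_pharaoh("0-0 1-1 2-1")
reverse = parse_pharaoh("0-0 1-1 2-2")
merged = grow_diag(forward, reverse)
print(sorted(merged))
```

Starting from the intersection keeps the high-precision links that both directions agree on, while the neighbour-based expansion recovers additional links from the union. A point in the union that is not adjacent to any accepted link is left out.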