Research Article
An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention
Table 1
Number of sequence pairs and sequence length in IWSLT16 DE-EN, WMT17 EN-DE, and WMT17 EN-CS.
Note. Train, Eval, and Test give the number of sequence pairs in the corresponding data subsets; length refers to the number of tokens in a sentence; the total length of the train set is the total number of tokens in the training set; the mean length of the train set is the ratio of the total length to the number of sequence pairs in the training set.
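The statistics defined in the note can be illustrated with a minimal sketch. It assumes whitespace tokenization and counts tokens on one side of each sequence pair only; the article does not state which tokenizer or which side was used, so both choices are assumptions, and the function name `corpus_length_stats` is hypothetical.

```python
from typing import Dict, List, Union


def corpus_length_stats(sentences: List[str]) -> Dict[str, Union[int, float]]:
    """Compute pair count, total length, and mean length for one data subset.

    Assumption: a token is a whitespace-separated string. The original
    work may use a different tokenizer (for example, subword units).
    """
    # Length of a sentence = number of tokens it contains.
    lengths = [len(sentence.split()) for sentence in sentences]

    # Total length = total number of tokens in the subset.
    total_length = sum(lengths)

    # Mean length = total length divided by the number of sequence pairs.
    # Guard against an empty subset to avoid division by zero.
    mean_length = total_length / len(lengths) if lengths else 0.0

    return {
        "pairs": len(lengths),
        "total_length": total_length,
        "mean_length": mean_length,
    }


if __name__ == "__main__":
    # Toy sentences for illustration only; not taken from any of the datasets.
    toy_train = ["ein kleines Beispiel", "noch ein Satz hier"]
    print(corpus_length_stats(toy_train))
```

Applied to the source side of a training set, the returned `total_length` and `mean_length` would correspond to the two train-set quantities described in the note, under the tokenization assumption above.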