Research Article
[Retracted] Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets
Table 3
Statistics of the proposed span extraction datasets.
| | Short span extraction: train set | Short span extraction: dev set | Short span extraction: test set | Long span extraction: train set | Long span extraction: dev set | Long span extraction: test set |
| --- | --- | --- | --- | --- | --- | --- |
| Paragraph # | 600 | 150 | 200 | 600 | 150 | 200 |
| Passage # | 15,265 | 2770 | 2484 | 7756 | 1163 | 1565 |
| Question # | 31,390 | 6774 | 8309 | 12,053 | 2376 | 5812 |
| Max tokens in a context # | 512 | 512 | 512 | 512 | 512 | 512 |
| Max answer tokens # | 6 | 6 | 6 | 9 | 9 | 9 |
| Min answer tokens # | 4 | 4 | 4 | 7 | 7 | 7 |