Research Article
[Retracted] Analyzing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets
Table 8
The number of contexts of each MRC dataset.
| Dataset | Context # | Train context # | Dev context # | Test context # | Unit of the context |
| --- | --- | --- | --- | --- | --- |
| CoQA | 8399 | 7199 | 500 | 700 | Passage |
| CLOTH | 7131 | 5513 | 805 | 813 | Passage |
| DREAM | 6444 | 3869 | 1288 | 1287 | Dialogue |
| Qangaroo-M | 2508 | 1620 | 342 | 546 | Passage |
| TQA | 1076 | 666 | 200 | 210 | Lesson |
| MovieQA | 548 | 362 | 77 | 109 | Movie |
| SQuAD1.1 | 536 | 442 | 48 | 46 | Article |
| SQuAD2.0 | 505 | 442 | 35 | 28 | Article |
| Short span | 20,519 | 15,265 | 2770 | 2484 | Passage |
| Long span | 10,484 | 7756 | 1163 | 1565 | Passage |
| Short cloze | 58,500 | 40,500 | 9000 | 9000 | Passage |
| Long cloze | 58,500 | 40,500 | 9000 | 9000 | Passage |