| Dataset | Question # | Train question # | Dev question # | Test question # | Percentage of the train set |
| --- | --- | --- | --- | --- | --- |
| SQuAD2.0 | 151,054 | 130,319 | 11,873 | 8862 | 86.27% |
| SQuAD1.1 | 107,702 | 87,599 | 10,570 | 9533 | 81.33% |
| TQA | 26,260 | 15,154 | 5309 | 5797 | 57.71% |
| MovieQA | 21,406 | 14,166 | 2844 | 4396 | 66.18% |
| MCScript | 13,939 | 9731 | 1411 | 2797 | 69.81% |
| DREAM | 10,197 | 6116 | 2040 | 2041 | 59.98% |
| ARC-E | 5197 | 2251 | 570 | 2376 | 43.31% |
| WikiQA | 3047 | 2118 | 296 | 633 | 69.51% |
| ARC-C | 2590 | 1119 | 299 | 1172 | 43.20% |
| ProPara | 488 | 391 | 54 | 43 | 80.12% |
| Short span | 46,473 | 31,390 | 6774 | 8309 | 67.54% |
| Long span | 20,241 | 12,053 | 2376 | 5812 | 59.55% |
| Short cloze | 58,500 | 40,500 | 9000 | 9000 | 69.23% |
| Long cloze | 58,500 | 40,500 | 9000 | 9000 | 69.23% |
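
The last column of the table appears to be the train question count divided by the total question count, expressed as a percentage. The following is a minimal sketch of that arithmetic, assuming this reading of the column; the function name `train_share` and the `splits` dictionary are hypothetical names introduced only for illustration, and the figures are copied from three rows of the table.

```python
def train_share(train: int, total: int) -> float:
    """Return the train split as a percentage of all questions, rounded to 2 places."""
    return round(100 * train / total, 2)


# (total questions, train questions) copied from three rows of the table.
splits = {
    "SQuAD2.0": (151_054, 130_319),
    "ARC-C": (2_590, 1_119),
    "ProPara": (488, 391),
}

for name, (total, train) in splits.items():
    print(f"{name}: {train_share(train, total):.2f}%")
```

Under this assumption the computed values match the percentages listed for those rows.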