Research Article
Random Forests in Count Data Modelling: An Analysis of the Influence of Data Features and Overdispersion on Regression Performance
Table 4
The impact of overdispersion and data features on the tuning of the minimal terminal node size.
| Data types | Variance-to-mean relationship | Node size | N = 50 (%) | N = 50 (%) | N = 50 (%) | N = 50 (%) | N = 50 (%) | N = 50 (%) | N = 250 (%) | N = 250 (%) | N = 250 (%) | N = 250 (%) | N = 250 (%) | N = 250 (%) | N = 1250 (%) | N = 1250 (%) | N = 1250 (%) | N = 1250 (%) | N = 1250 (%) | N = 1250 (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Categorical | Linear | | 25 | 26 | 18 | 32 | 31 | 34 | 5 | 11 | 2 | 27 | 26 | 25 | 0 | 0 | 0 | 18 | 15 | 17 |
| | | | 12 | 10 | 16 | 29 | 31 | 26 | 31 | 21 | 28 | 47 | 48 | 39 | 22 | 16 | 28 | 43 | 56 | 52 |
| | | | 63 | 64 | 66 | 39 | 38 | 40 | 64 | 68 | 70 | 26 | 26 | 36 | 78 | 84 | 72 | 39 | 29 | 31 |
| | Quadratic | | 22 | 16 | 19 | 31 | 27 | 31 | 4 | 6 | 5 | 30 | 18 | 18 | 1 | 0 | 0 | 15 | 10 | 10 |
| | | | 15 | 17 | 11 | 26 | 22 | 14 | 22 | 16 | 25 | 33 | 52 | 45 | 14 | 16 | 10 | 49 | 57 | 51 |
| | | | 63 | 67 | 70 | 43 | 51 | 55 | 74 | 78 | 70 | 37 | 30 | 37 | 85 | 84 | 90 | 36 | 33 | 39 |
| 25% of predictors are quantitative | Linear | | 100 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 100 | 0 | 0 |
| | | | 0 | 0 | 100 | 0 | 0 | 100 | 100 | 0 | 0 | 100 | 0 | 0 | 0 | 100 | 100 | 0 | 100 | 100 |
| | | | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 |
| | Quadratic | | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 |
| | | | 0 | 100 | 0 | 0 | 100 | 0 | 100 | 100 | 0 | 0 | 0 | 100 | 0 | 100 | 0 | 0 | 0 | 0 |
| | | | 100 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 100 | 100 | 100 | 0 | 100 | 0 | 100 | 0 | 100 | 100 |
| 50% of predictors are quantitative | Linear | | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 100 | 0 | 0 |
| | | | 0 | 0 | 100 | 100 | 0 | 0 | 100 | 0 | 0 | 100 | 0 | 0 | 100 | 100 | 0 | 0 | 100 | 100 |
| | | | 100 | 100 | 0 | 0 | 100 | 0 | 0 | 100 | 100 | 0 | 100 | 0 | 0 | 0 | 100 | 0 | 0 | 0 |
| | Quadratic | | 0 | 100 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 100 | 0 |
| | | | 100 | 0 | 0 | 100 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 100 | 0 | 0 |
| | | | 0 | 0 | 100 | 0 | 100 | 0 | 0 | 100 | 100 | 100 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 100 |
| 75% of predictors are quantitative | Linear | | 100 | 0 | 100 | 0 | 0 | 100 | 100 | 100 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 |
| | | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 100 | 100 | 100 | 0 | 0 | 100 | 100 |
| | | | 0 | 100 | 0 | 100 | 100 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 |
| | Quadratic | | 0 | 100 | 0 | 100 | 0 | 0 | 100 | 0 | 100 | 0 | 100 | 0 | 0 | 0 | 100 | 0 | 100 | 0 |
| | | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 100 | 100 | 100 | 0 | 100 | 0 | 0 |
| | | | 100 | 0 | 100 | 0 | 100 | 100 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 |
| Quantitative | Linear | | 0 | 100 | 100 | 0 | 0 | 100 | 0 | 100 | 100 | 100 | 0 | 100 | 100 | 0 | 0 | 100 | 100 | 100 |
| | | | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 100 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 |
| | Quadratic | | 100 | 0 | 100 | 0 | 0 | 100 | 0 | 0 | 100 | 0 | 100 | 100 | 0 | 0 | 0 | 100 | 0 | 100 |
| | | | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 100 | 100 | 0 | 100 | 0 |
| | | | 0 | 100 | 0 | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 |