Research Article
Random Forests in Count Data Modelling: An Analysis of the Influence of Data Features and Overdispersion on Regression Performance
Table 2
Effect of predictor types and dispersion amplitude on the number of variables randomly selected at each split.
| Data types | Variance-to-mean relationship | Best mtry | N = 50 (%) | | | | | | N = 250 (%) | | | | | | N = 1250 (%) | | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Categorical | Linear | 2 | 81 | 89 | 90 | 56 | 58 | 65 | 98 | 97 | 99 | 80 | 84 | 86 | 100 | 100 | 100 | 91 | 92 | 94 |
| | | | 9 | 3 | 6 | 27 | 27 | 30 | 1 | 3 | 1 | 17 | 15 | 14 | 0 | 0 | 0 | 9 | 8 | 6 |
| | | | 8 | 7 | 4 | 9 | 11 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 2 | 1 | 0 | 8 | 4 | 0 | 1 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Quadratic | 2 | 89 | 88 | 88 | 62 | 74 | 74 | 98 | 99 | 99 | 81 | 85 | 92 | 100 | 100 | 100 | 97 | 99 | 98 |
| | | | 7 | 8 | 4 | 28 | 16 | 18 | 2 | 0 | 1 | 18 | 15 | 8 | 0 | 0 | 0 | 3 | 1 | 2 |
| | | | 2 | 3 | 1 | 4 | 7 | 6 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 2 | 1 | 7 | 6 | 3 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 25% of predictors are quantitative | Linear | 2 | 0 | 0 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 100 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Quadratic | 2 | 100 | 100 | 100 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 50% of predictors are quantitative | Linear | 2 | 100 | 100 | 100 | 0 | 0 | 0 | 100 | 100 | 0 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 100 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Quadratic | 2 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| 75% of predictors are quantitative | Linear | 2 | 100 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Quadratic | 2 | 100 | 0 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 100 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Quantitative | Linear | 2 | 100 | 0 | 100 | 100 | 100 | 100 | 0 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 100 | 0 | 0 | 0 | 0 | 100 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | Quadratic | 2 | 100 | 100 | 0 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | | | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
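Within each data type and variance-to-mean relationship, the percentages in a given column sum to 100 across the mtry rows. This suggests that each cell reports the share of simulation replicates in which that mtry value was retained as best. The following is a minimal sketch of such a tally under that assumption. The function name `best_mtry_shares`, the candidate mtry values, and the list of winners are all hypothetical and are not taken from the article.

```python
from collections import Counter


def best_mtry_shares(best_mtry_per_replicate, candidates):
    """Return {mtry: percentage of replicates in which that mtry was retained}."""
    total = len(best_mtry_per_replicate)
    if total == 0:
        raise ValueError("need at least one replicate")
    counts = Counter(best_mtry_per_replicate)
    # Candidates that never win are reported as 0 rather than omitted.
    return {m: 100.0 * counts.get(m, 0) / total for m in candidates}


# Hypothetical input: the retained mtry in each of 10 replicates of one setting.
winners = [2, 2, 2, 3, 2, 4, 2, 2, 3, 2]
shares = best_mtry_shares(winners, candidates=[2, 3, 4])
print(shares)
```

Repeating this tally for every combination of sample size and dispersion setting would give one column of a table laid out like the one above.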