Research Article

A Machine Learning Approach to Assess Differential Item Functioning in Psychometric Questionnaires Using the Elastic Net Regularized Ordinal Logistic Regression in Small Sample Size Groups

Table 4

The type I error rates of the regularized (elastic net) and non-regularized OLR models in detecting severe uniform DIF (DIF=0.8) when J=5.

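| I | Ratio | N | OLR | Ridge (w=0) | Elastic net (w=0.01) | Elastic net (w=0.02) | Elastic net (w=0.03) | Elastic net (w=0.04) | Elastic net (w=0.05) | Elastic net (w=0.06) | Elastic net (w=0.07) | Elastic net (w=0.1) | Elastic net (w=0.5) | LASSO (w=1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | nr=nf | 100 | 0.058 | 0.012 | 0.010 | 0.027 | 0.038 | 0.048 | 0.058 | 0.066 | 0.075 | 0.084 | 0.104 | 0.106 |
| 5 | nr=nf | 150 | 0.078 | 0.014 | 0.013 | 0.034 | 0.051 | 0.065 | 0.078 | 0.086 | 0.094 | 0.108 | 0.132 | 0.134 |
| 5 | nr=nf | 200 | 0.094 | 0.013 | 0.011 | 0.038 | 0.065 | 0.080 | 0.089 | 0.098 | 0.106 | 0.121 | 0.149 | 0.150 |
| 5 | nr=nf | 300 | 0.135 | 0.018 | 0.017 | 0.054 | 0.082 | 0.108 | 0.124 | 0.137 | 0.147 | 0.166 | 0.209 | 0.213 |
| 5 | nr=nf | 400 | 0.172 | 0.021 | 0.018 | 0.061 | 0.099 | 0.128 | 0.150 | 0.167 | 0.181 | 0.206 | 0.261 | 0.266 |
| 5 | nr=2nf | 100 | 0.059 | 0.015 | 0.014 | 0.030 | 0.042 | 0.053 | 0.059 | 0.067 | 0.072 | 0.081 | 0.105 | 0.107 |
| 5 | nr=2nf | 150 | 0.076 | 0.012 | 0.012 | 0.030 | 0.049 | 0.064 | 0.072 | 0.081 | 0.088 | 0.102 | 0.127 | 0.128 |
| 5 | nr=2nf | 200 | 0.080 | 0.014 | 0.012 | 0.035 | 0.055 | 0.071 | 0.081 | 0.091 | 0.099 | 0.112 | 0.143 | 0.145 |
| 5 | nr=2nf | 300 | 0.121 | 0.022 | 0.020 | 0.052 | 0.080 | 0.096 | 0.111 | 0.120 | 0.131 | 0.151 | 0.194 | 0.197 |
| 5 | nr=2nf | 400 | 0.155 | 0.021 | 0.018 | 0.056 | 0.089 | 0.112 | 0.135 | 0.151 | 0.162 | 0.185 | 0.232 | 0.234 |
| 5 | nr=3nf | 100 | 0.059 | 0.012 | 0.012 | 0.026 | 0.039 | 0.048 | 0.056 | 0.063 | 0.068 | 0.076 | 0.099 | 0.102 |
| 5 | nr=3nf | 150 | 0.072 | 0.011 | 0.009 | 0.032 | 0.047 | 0.062 | 0.070 | 0.078 | 0.084 | 0.098 | 0.122 | 0.124 |
| 5 | nr=3nf | 200 | 0.077 | 0.015 | 0.014 | 0.033 | 0.053 | 0.065 | 0.074 | 0.082 | 0.087 | 0.101 | 0.125 | 0.126 |
| 5 | nr=3nf | 300 | 0.103 | 0.017 | 0.015 | 0.042 | 0.062 | 0.081 | 0.096 | 0.108 | 0.118 | 0.137 | 0.169 | 0.171 |
| 5 | nr=3nf | 400 | 0.131 | 0.018 | 0.016 | 0.051 | 0.078 | 0.099 | 0.096 | 0.131 | 0.143 | 0.166 | 0.202 | 0.206 |
| λBIC | | | - | 0.380 | 0.380 | 0.190 | 0.130 | 0.095 | 0.076 | 0.063 | 0.054 | 0.038 | 0.008 | 0.004 |
| 10 | nr=nf | 100 | 0.032 | 0.011 | 0.009 | 0.023 | 0.032 | 0.037 | 0.041 | 0.044 | 0.048 | 0.052 | 0.061 | 0.062 |
| 10 | nr=nf | 150 | 0.037 | 0.010 | 0.010 | 0.025 | 0.034 | 0.042 | 0.047 | 0.050 | 0.053 | 0.058 | 0.069 | 0.070 |
| 10 | nr=nf | 200 | 0.039 | 0.014 | 0.012 | 0.027 | 0.037 | 0.045 | 0.050 | 0.054 | 0.057 | 0.062 | 0.074 | 0.075 |
| 10 | nr=nf | 300 | 0.044 | 0.013 | 0.011 | 0.027 | 0.039 | 0.047 | 0.053 | 0.057 | 0.060 | 0.067 | 0.078 | 0.079 |
| 10 | nr=nf | 400 | 0.047 | 0.012 | 0.010 | 0.026 | 0.039 | 0.048 | 0.053 | 0.059 | 0.063 | 0.070 | 0.083 | 0.085 |
| 10 | nr=2nf | 100 | 0.033 | 0.010 | 0.009 | 0.023 | 0.030 | 0.035 | 0.040 | 0.043 | 0.045 | 0.050 | 0.060 | 0.061 |
| 10 | nr=2nf | 150 | 0.038 | 0.012 | 0.011 | 0.026 | 0.035 | 0.041 | 0.045 | 0.050 | 0.052 | 0.059 | 0.069 | 0.069 |
| 10 | nr=2nf | 200 | 0.036 | 0.010 | 0.009 | 0.024 | 0.034 | 0.041 | 0.045 | 0.049 | 0.051 | 0.057 | 0.068 | 0.069 |
| 10 | nr=2nf | 300 | 0.041 | 0.011 | 0.010 | 0.025 | 0.035 | 0.044 | 0.049 | 0.054 | 0.058 | 0.064 | 0.076 | 0.077 |
| 10 | nr=2nf | 400 | 0.049 | 0.012 | 0.011 | 0.027 | 0.040 | 0.048 | 0.054 | 0.058 | 0.063 | 0.070 | 0.085 | 0.086 |
| 10 | nr=3nf | 100 | 0.031 | 0.011 | 0.010 | 0.023 | 0.029 | 0.034 | 0.038 | 0.041 | 0.043 | 0.048 | 0.058 | 0.058 |
| 10 | nr=3nf | 150 | 0.035 | 0.010 | 0.009 | 0.024 | 0.033 | 0.039 | 0.044 | 0.048 | 0.051 | 0.056 | 0.065 | 0.066 |
| 10 | nr=3nf | 200 | 0.031 | 0.009 | 0.009 | 0.020 | 0.029 | 0.035 | 0.040 | 0.044 | 0.046 | 0.050 | 0.060 | 0.061 |
| 10 | nr=3nf | 300 | 0.045 | 0.013 | 0.012 | 0.028 | 0.038 | 0.047 | 0.050 | 0.054 | 0.058 | 0.064 | 0.077 | 0.078 |
| 10 | nr=3nf | 400 | 0.042 | 0.011 | 0.009 | 0.024 | 0.035 | 0.045 | 0.050 | 0.054 | 0.058 | 0.066 | 0.078 | 0.078 |
| λBIC | | | - | 0.315 | 0.315 | 0.160 | 0.105 | 0.080 | 0.063 | 0.052 | 0.045 | 0.032 | 0.006 | 0.003 |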

Note: DIF: differential item functioning; I: number of items in the scale; J: number of response categories; LASSO: least absolute shrinkage and selection operator; OLR: ordinal logistic regression; w: weighting parameter; Ratio: sample size ratio between the focal and reference groups; nf and nr: sample sizes in the focal and reference groups, respectively; N: total sample size (N = nf + nr). The λ values in the λBIC rows were selected according to the Bayesian information criterion (BIC).
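
For orientation, the role of the weighting parameter w can be written as a penalty term. The expression below is only a sketch that assumes the common glmnet-style elastic-net parameterization; the exact scaling used in the article (for example, the factor 1/2 on the ridge term) is not stated in this table and may differ. The symbol β, standing for the penalized regression coefficients, is introduced here purely for illustration.

```latex
% Assumed glmnet-style elastic-net penalty; \beta is a hypothetical symbol
% for the penalized coefficients, not notation taken from the table.
P_{\lambda, w}(\beta)
  = \lambda \left[ w \, \lVert \beta \rVert_1
  + \frac{1 - w}{2} \, \lVert \beta \rVert_2^2 \right],
  \qquad 0 \le w \le 1
```

Under this assumed form, w=0 leaves only the squared (ridge) term and w=1 leaves only the absolute-value (LASSO) term, which matches the Ridge and LASSO column labels of the table.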