Research Article

A Machine Learning Approach to Assess Differential Item Functioning in Psychometric Questionnaires Using the Elastic Net Regularized Ordinal Logistic Regression in Small Sample Size Groups

Table 1

Power of the regularized (elastic net) and non-regularized OLR models to detect moderate uniform DIF (DIF = 0.4) when J = 5.

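Columns w=0.01 through w=0.5 are elastic net OLR; w=0 is ridge and w=1 is LASSO.

| I | Ratio | N | OLR | Ridge (w=0) | w=0.01 | w=0.02 | w=0.03 | w=0.04 | w=0.05 | w=0.06 | w=0.07 | w=0.1 | w=0.5 | LASSO (w=1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | nr=nf | 100 | 0.179 | 0.100 | 0.098 | 0.146 | 0.175 | 0.192 | 0.215 | 0.224 | 0.233 | 0.247 | 0.277 | 0.281 |
| 5 | nr=nf | 150 | 0.317 | 0.184 | 0.177 | 0.265 | 0.310 | 0.332 | 0.347 | 0.357 | 0.361 | 0.370 | 0.412 | 0.413 |
| 5 | nr=nf | 200 | 0.365 | 0.217 | 0.210 | 0.190 | 0.360 | 0.389 | 0.400 | 0.409 | 0.418 | 0.433 | 0.479 | 0.483 |
| 5 | nr=nf | 300 | 0.559 | 0.409 | 0.392 | 0.518 | 0.572 | 0.594 | 0.610 | 0.621 | 0.625 | 0.632 | 0.673 | 0.677 |
| 5 | nr=nf | 400 | 0.702 | 0.550 | 0.528 | 0.673 | 0.734 | 0.754 | 0.770 | 0.773 | 0.772 | 0.774 | 0.808 | 0.811 |
| 5 | nr=2nf | 100 | 0.161 | 0.068 | 0.065 | 0.126 | 0.155 | 0.174 | 0.188 | 0.199 | 0.207 | 0.215 | 0.257 | 0.260 |
| 5 | nr=2nf | 150 | 0.279 | 0.140 | 0.129 | 0.230 | 0.266 | 0.291 | 0.300 | 0.309 | 0.320 | 0.333 | 0.366 | 0.366 |
| 5 | nr=2nf | 200 | 0.334 | 0.204 | 0.196 | 0.287 | 0.333 | 0.360 | 0.375 | 0.383 | 0.397 | 0.409 | 0.439 | 0.441 |
| 5 | nr=2nf | 300 | 0.497 | 0.344 | 0.329 | 0.452 | 0.498 | 0.527 | 0.544 | 0.556 | 0.562 | 0.579 | 0.626 | 0.632 |
| 5 | nr=2nf | 400 | 0.629 | 0.499 | 0.474 | 0.598 | 0.644 | 0.661 | 0.683 | 0.694 | 0.700 | 0.708 | 0.737 | 0.739 |
| 5 | nr=3nf | 100 | 0.143 | 0.067 | 0.064 | 0.108 | 0.135 | 0.149 | 0.158 | 0.167 | 0.175 | 0.190 | 0.221 | 0.225 |
| 5 | nr=3nf | 150 | 0.219 | 0.108 | 0.103 | 0.176 | 0.211 | 0.232 | 0.244 | 0.249 | 0.257 | 0.278 | 0.306 | 0.308 |
| 5 | nr=3nf | 200 | 0.279 | 0.171 | 0.160 | 0.250 | 0.280 | 0.300 | 0.314 | 0.322 | 0.331 | 0.346 | 0.379 | 0.379 |
| 5 | nr=3nf | 300 | 0.430 | 0.279 | 0.265 | 0.373 | 0.432 | 0.457 | 0.472 | 0.481 | 0.486 | 0.503 | 0.543 | 0.546 |
| 5 | nr=3nf | 400 | 0.539 | 0.397 | 0.381 | 0.507 | 0.556 | 0.583 | 0.596 | 0.604 | 0.610 | 0.619 | 0.652 | 0.655 |
| 5 | λBIC | | - | 0.380 | 0.381 | 0.190 | 0.130 | 0.095 | 0.076 | 0.063 | 0.054 | 0.038 | 0.008 | 0.004 |
| 10 | nr=nf | 100 | 0.117 | 0.075 | 0.072 | 0.116 | 0.143 | 0.153 | 0.161 | 0.163 | 0.166 | 0.171 | 0.189 | 0.190 |
| 10 | nr=nf | 150 | 0.173 | 0.138 | 0.133 | 0.184 | 0.216 | 0.232 | 0.235 | 0.240 | 0.245 | 0.256 | 0.272 | 0.277 |
| 10 | nr=nf | 200 | 0.248 | 0.189 | 0.183 | 0.262 | 0.285 | 0.305 | 0.318 | 0.324 | 0.333 | 0.343 | 0.355 | 0.358 |
| 10 | nr=nf | 300 | 0.350 | 0.283 | 0.276 | 0.360 | 0.405 | 0.424 | 0.432 | 0.440 | 0.442 | 0.452 | 0.472 | 0.473 |
| 10 | nr=nf | 400 | 0.462 | 0.410 | 0.394 | 0.502 | 0.531 | 0.548 | 0.558 | 0.562 | 0.563 | 0.565 | 0.587 | 0.587 |
| 10 | nr=2nf | 100 | 0.102 | 0.072 | 0.070 | 0.112 | 0.123 | 0.135 | 0.142 | 0.145 | 0.147 | 0.156 | 0.165 | 0.166 |
| 10 | nr=2nf | 150 | 0.167 | 0.121 | 0.120 | 0.172 | 0.198 | 0.211 | 0.222 | 0.228 | 0.232 | 0.238 | 0.258 | 0.258 |
| 10 | nr=2nf | 200 | 0.207 | 0.144 | 0.142 | 0.218 | 0.241 | 0.250 | 0.259 | 0.263 | 0.267 | 0.275 | 0.293 | 0.293 |
| 10 | nr=2nf | 300 | 0.314 | 0.256 | 0.242 | 0.332 | 0.364 | 0.380 | 0.394 | 0.401 | 0.403 | 0.410 | 0.432 | 0.434 |
| 10 | nr=2nf | 400 | 0.389 | 0.333 | 0.324 | 0.417 | 0.456 | 0.479 | 0.487 | 0.492 | 0.499 | 0.511 | 0.537 | 0.537 |
| 10 | nr=3nf | 100 | 0.099 | 0.064 | 0.062 | 0.098 | 0.119 | 0.133 | 0.141 | 0.146 | 0.148 | 0.150 | 0.158 | 0.159 |
| 10 | nr=3nf | 150 | 0.146 | 0.098 | 0.096 | 0.150 | 0.175 | 0.188 | 0.194 | 0.199 | 0.203 | 0.211 | 0.220 | 0.222 |
| 10 | nr=3nf | 200 | 0.168 | 0.114 | 0.110 | 0.165 | 0.200 | 0.220 | 0.224 | 0.229 | 0.234 | 0.245 | 0.274 | 0.276 |
| 10 | nr=3nf | 300 | 0.264 | 0.204 | 0.196 | 0.272 | 0.300 | 0.318 | 0.330 | 0.336 | 0.345 | 0.354 | 0.375 | 0.376 |
| 10 | nr=3nf | 400 | 0.349 | 0.281 | 0.265 | 0.367 | 0.390 | 0.411 | 0.426 | 0.433 | 0.441 | 0.459 | 0.482 | 0.483 |
| 10 | λBIC | | - | 0.315 | 0.315 | 0.160 | 0.105 | 0.080 | 0.063 | 0.052 | 0.045 | 0.032 | 0.006 | 0.003 |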

Note: DIF: differential item functioning; I: number of items in the scale; J: number of response categories; LASSO: least absolute shrinkage and selection operator; λ: regularization parameter; OLR: ordinal logistic regression; w: weighting parameter; Ratio: sample size ratio between the focal and reference groups; nf and nr: sample sizes in the focal and reference groups, respectively; N: total sample size (N = nf + nr). The λ values in the λBIC rows were selected according to the Bayesian information criterion (BIC).
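
The roles of λ and w in the table can be illustrated with a minimal sketch of an elastic net penalty. The table does not state the exact penalty formula, so the parametrization below (the common glmnet-style form, λ[w·Σ|β| + ((1 − w)/2)·Σβ²]) is an assumption, chosen because it reduces to ridge at w = 0 and to LASSO at w = 1, matching the column labels. The function name and the coefficient vector are hypothetical; the λ values are taken from the λBIC row for I = 5.

```python
def elastic_net_penalty(beta, lam, w):
    """Elastic net penalty under an ASSUMED glmnet-style parametrization.

    w = 0 keeps only the squared (ridge) term;
    w = 1 keeps only the absolute-value (LASSO) term.
    """
    l1 = sum(abs(b) for b in beta)   # LASSO part
    l2 = sum(b * b for b in beta)    # ridge part
    return lam * (w * l1 + 0.5 * (1.0 - w) * l2)


# Hypothetical coefficients, for illustration only (not from the article).
beta = [0.4, -0.2, 0.0]

# lambda values from the lambda-BIC row for I = 5.
ridge_only = elastic_net_penalty(beta, lam=0.380, w=0.0)
mixed = elastic_net_penalty(beta, lam=0.095, w=0.04)
lasso_only = elastic_net_penalty(beta, lam=0.004, w=1.0)

print(ridge_only, mixed, lasso_only)
```

Under this assumed form, the selected λ shrinks as w grows because the absolute-value term penalizes small coefficients more heavily than the squared term, which is consistent with the decreasing λBIC values across the columns.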