A Machine Learning Approach to Assess Differential Item Functioning in Psychometric Questionnaires Using the Elastic Net Regularized Ordinal Logistic Regression in Small Sample Size Groups
Table 3
Power of the regularized (elastic net) and non-regularized OLR models in detecting severe uniform DIF (DIF=0.8) when J=5.
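Columns w=0.01 through w=0.5 are elastic-net OLR; w=0 corresponds to ridge and w=1 to LASSO.

| I | Ratio | N | OLR | Ridge (w=0) | w=0.01 | w=0.02 | w=0.03 | w=0.04 | w=0.05 | w=0.06 | w=0.07 | w=0.1 | w=0.5 | LASSO (w=1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | nr=nf | 100 | 0.705 | 0.564 | 0.550 | 0.679 | 0.727 | 0.754 | 0.767 | 0.774 | 0.778 | 0.790 | 0.808 | 0.809 |
| 5 | nr=nf | 150 | 0.867 | 0.789 | 0.781 | 0.860 | 0.889 | 0.901 | 0.906 | 0.910 | 0.914 | 0.917 | 0.931 | 0.932 |
| 5 | nr=nf | 200 | 0.940 | 0.894 | 0.885 | 0.944 | 0.958 | 0.964 | 0.966 | 0.968 | 0.969 | 0.969 | 0.971 | 0.971 |
| 5 | nr=nf | 300 | 0.995 | 0.985 | 0.984 | 0.996 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 |
| 5 | nr=nf | 400 | 0.998 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 5 | nr=2nf | 100 | 0.622 | 0.471 | 0.464 | 0.600 | 0.646 | 0.673 | 0.691 | 0.701 | 0.703 | 0.717 | 0.738 | 0.744 |
| 5 | nr=2nf | 150 | 0.811 | 0.733 | 0.729 | 0.817 | 0.851 | 0.862 | 0.871 | 0.873 | 0.879 | 0.886 | 0.898 | 0.899 |
| 5 | nr=2nf | 200 | 0.912 | 0.850 | 0.845 | 0.920 | 0.931 | 0.940 | 0.947 | 0.951 | 0.951 | 0.953 | 0.961 | 0.961 |
| 5 | nr=2nf | 300 | 0.989 | 0.987 | 0.975 | 0.986 | 0.989 | 0.990 | 0.991 | 0.992 | 0.992 | 0.993 | 0.995 | 0.995 |
| 5 | nr=2nf | 400 | 0.999 | 0.997 | 0.997 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 5 | nr=3nf | 100 | 0.557 | 0.400 | 0.393 | 0.519 | 0.567 | 0.559 | 0.613 | 0.623 | 0.631 | 0.648 | 0.668 | 0.670 |
| 5 | nr=3nf | 150 | 0.747 | 0.620 | 0.610 | 0.737 | 0.770 | 0.784 | 0.794 | 0.802 | 0.807 | 0.815 | 0.840 | 0.841 |
| 5 | nr=3nf | 200 | 0.873 | 0.785 | 0.777 | 0.862 | 0.892 | 0.907 | 0.913 | 0.915 | 0.918 | 0.918 | 0.931 | 0.932 |
| 5 | nr=3nf | 300 | 0.968 | 0.950 | 0.947 | 0.970 | 0.978 | 0.981 | 0.983 | 0.982 | 0.982 | 0.984 | 0.988 | 0.988 |
| 5 | nr=3nf | 400 | 0.995 | 0.985 | 0.984 | 0.995 | 0.995 | 0.996 | 0.996 | 0.997 | 0.997 | 0.997 | 0.997 | 0.997 |
| 5 | λBIC | | - | 0.380 | 0.380 | 0.190 | 0.130 | 0.095 | 0.076 | 0.063 | 0.054 | 0.038 | 0.008 | 0.004 |
| 10 | nr=nf | 100 | 0.456 | 0.383 | 0.377 | 0.486 | 0.518 | 0.543 | 0.548 | 0.554 | 0.559 | 0.576 | 0.596 | 0.597 |
| 10 | nr=nf | 150 | 0.665 | 0.592 | 0.580 | 0.687 | 0.713 | 0.726 | 0.737 | 0.746 | 0.749 | 0.760 | 0.773 | 0.774 |
| 10 | nr=nf | 200 | 0.800 | 0.763 | 0.754 | 0.835 | 0.855 | 0.860 | 0.861 | 0.864 | 0.868 | 0.872 | 0.888 | 0.888 |
| 10 | nr=nf | 300 | 0.940 | 0.921 | 0.913 | 0.951 | 0.963 | 0.967 | 0.967 | 0.966 | 0.967 | 0.968 | 0.976 | 0.976 |
| 10 | nr=nf | 400 | 0.979 | 0.971 | 0.968 | 0.988 | 0.990 | 0.992 | 0.991 | 0.991 | 0.991 | 0.991 | 0.993 | 0.993 |
| 10 | nr=2nf | 100 | 0.341 | 0.336 | 0.331 | 0.433 | 0.485 | 0.503 | 0.518 | 0.523 | 0.522 | 0.534 | 0.545 | 0.547 |
| 10 | nr=2nf | 150 | 0.606 | 0.530 | 0.521 | 0.619 | 0.665 | 0.674 | 0.689 | 0.698 | 0.703 | 0.712 | 0.719 | 0.719 |
| 10 | nr=2nf | 200 | 0.748 | 0.687 | 0.676 | 0.770 | 0.796 | 0.809 | 0.813 | 0.814 | 0.820 | 0.827 | 0.832 | 0.832 |
| 10 | nr=2nf | 300 | 0.907 | 0.879 | 0.870 | 0.916 | 0.929 | 0.933 | 0.935 | 0.937 | 0.940 | 0.947 | 0.950 | 0.950 |
| 10 | nr=2nf | 400 | 0.965 | 0.958 | 0.955 | 0.973 | 0.978 | 0.979 | 0.981 | 0.981 | 0.982 | 0.982 | 0.987 | 0.987 |
| 10 | nr=3nf | 100 | 0.341 | 0.274 | 0.263 | 0.361 | 0.400 | 0.420 | 0.432 | 0.437 | 0.441 | 0.447 | 0.464 | 0.464 |
| 10 | nr=3nf | 150 | 0.545 | 0.459 | 0.450 | 0.558 | 0.591 | 0.605 | 0.612 | 0.623 | 0.626 | 0.635 | 0.643 | 0.644 |
| 10 | nr=3nf | 200 | 0.667 | 0.596 | 0.589 | 0.678 | 0.721 | 0.737 | 0.749 | 0.757 | 0.751 | 0.761 | 0.771 | 0.771 |
| 10 | nr=3nf | 300 | 0.835 | 0.804 | 0.795 | 0.857 | 0.882 | 0.895 | 0.900 | 0.902 | 0.905 | 0.909 | 0.913 | 0.913 |
| 10 | nr=3nf | 400 | 0.935 | 0.905 | 0.896 | 0.941 | 0.951 | 0.958 | 0.960 | 0.960 | 0.960 | 0.960 | 0.963 | 0.964 |
| 10 | λBIC | | - | 0.315 | 0.315 | 0.160 | 0.105 | 0.080 | 0.063 | 0.052 | 0.045 | 0.032 | 0.006 | 0.003 |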
Note: DIF: differential item functioning; I: number of items in the scale; J: number of response categories; LASSO: least absolute shrinkage and selection operator; λ: regularization parameter; OLR: ordinal logistic regression; w: weighting parameter; Ratio: sample size ratio between the focal and reference groups; nf and nr: sample sizes in the focal and reference groups, respectively; N: total sample size (N = nf + nr). The λBIC values were selected according to the Bayesian information criterion (BIC).
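The table labels w=0 as ridge and w=1 as LASSO, with intermediate w giving the elastic net. A minimal sketch of a penalty consistent with that labeling follows. It assumes the common glmnet-style parametrization. The factor 1/2 on the squared term, the helper name `elastic_net_penalty`, and the example coefficients are assumptions for illustration, not taken from the paper.

```python
def elastic_net_penalty(beta, lam, w):
    """Hypothetical helper: lam * (w * ||beta||_1 + (1 - w) / 2 * ||beta||_2^2).

    w = 0 leaves only the ridge (squared L2) term;
    w = 1 leaves only the LASSO (L1) term.
    """
    l1 = sum(abs(b) for b in beta)      # L1 norm of the coefficients
    l2_sq = sum(b * b for b in beta)    # squared L2 norm of the coefficients
    return lam * (w * l1 + (1.0 - w) / 2.0 * l2_sq)


# Illustrative (made-up) coefficient vector, paired with lambda values
# of the kind reported in the lambda-BIC rows of the table.
beta = [0.8, -0.5, 0.0]
ridge_term = elastic_net_penalty(beta, lam=0.380, w=0.0)
lasso_term = elastic_net_penalty(beta, lam=0.004, w=1.0)
```

Under this parametrization, the smaller λ values selected by BIC at larger w are what one would expect, since the L1 term penalizes small coefficients more heavily than the squared term does.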