Performance of Multi-group DIF Methods in Assessing Cross-Country Score Comparability of International Large-Scale Assessments

07/31/2020
by Dandan Chen, et al.

Standardized large-scale testing is a contested topic, and test fairness sits at its core. This study found that two of five recent multi-group differential item functioning (DIF) detection methods can capture both the uniform and the nonuniform DIF that affect test fairness: the improved Wald test and the generalized logistic regression procedure. No prior research, however, has compared the performance of these two methods directly. This study assessed the commonalities and differences between the empirical results that the two methods produced on the latest TIMSS math score data. The primary conclusion was that the improved Wald test is relatively more established than the generalized logistic regression procedure for multi-group DIF analysis. These empirical results may inform the selection of a multi-group DIF method for the analysis of international large-scale assessment (ILSA) scores.
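For readers unfamiliar with the distinction between uniform and nonuniform DIF, the following is a minimal sketch of how a multi-group logistic regression model for a single item is commonly written. It is not taken from the paper itself; the symbols (Y_i, S_i, alpha_g, beta_g) and the choice of group 1 as the reference group are assumptions made here for illustration.

```latex
% Assumed notation, not the paper's:
%   Y_i      = 1 if examinee i answers the studied item correctly, else 0
%   S_i      = matching variable for examinee i (for example, the total test score)
%   g        = group (country) of examinee i, with g = 1 as the reference group
%   alpha_g  = group-specific shift in the intercept
%   beta_g   = group-specific shift in the slope on S_i
\[
\operatorname{logit} P(Y_i = 1 \mid S_i, g)
  = \alpha + \beta S_i + \alpha_g + \beta_g S_i ,
\qquad \alpha_1 = \beta_1 = 0 .
\]
```

Under this sketch, an item shows no DIF when every alpha_g and beta_g is zero, uniform DIF when some alpha_g is nonzero while every beta_g is zero (the group effect is constant across ability levels), and nonuniform DIF when some beta_g is nonzero (the group effect changes with ability).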
