Regularized Bayesian calibration and scoring of the WD-FAB IRT model improves predictive performance over maximum marginal likelihood

10/03/2020
by   Joshua C. Chang, et al.

Item response theory (IRT) is the statistical paradigm underlying a dominant family of generative probabilistic models for test responses, used to quantify traits in individuals relative to target populations. The graded response model (GRM) is a particular IRT model used for ordered polytomous test responses. Both developing and applying the GRM and other IRT models require statistical decisions. To formulate these models (calibration), one must decide on methodologies for item selection, inference, and regularization. To apply these models (test scoring), one must make similar decisions, often prioritizing computational tractability and/or interpretability. In many applications, such as the Work Disability Functional Assessment Battery (WD-FAB), tractability implies approximating an individual's score distribution by estimates of its mean and variance, and obtaining that score conditional on only point estimates of the calibrated model. In this manuscript, we evaluate the calibration and scoring of models under this common use case using Bayesian cross-validation. Applied to the WD-FAB responses collected for the National Institutes of Health, we assess the predictive power of implementations of the GRM by their ability to yield, on validation sets of respondents, estimates of latent ability with uncertainty that are most predictive of patterns of item responses. IRT models in general have a concrete interpretation: latent abilities combine with item parameters to produce predictions of response patterns. Our main finding is that regularized Bayesian calibration of the GRM outperforms the prior-free empirical Bayesian procedure of maximum marginal likelihood. We also motivate the use of compactly supported priors in test scoring.
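To make the scoring use case concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard logistic form of the GRM, in which P(Y >= k | theta) = sigmoid(a * (theta - b_k)) with increasing thresholds b_k, and it scores one respondent on a grid under a uniform prior with compact support. The function names, the item parameter values, the support [-4, 4], and the grid size are all hypothetical choices made for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def grm_category_probs(theta, discrimination, thresholds):
    """P(Y = k | theta) for k = 0..K-1 under the logistic GRM.

    `thresholds` must be increasing so that every probability is nonnegative.
    """
    # Cumulative P(Y >= k): 1 for k = 0, a sigmoid for interior k, 0 past the top.
    cum = [1.0]
    cum += [sigmoid(discrimination * (theta - b)) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def score_respondent(responses, items, lower=-4.0, upper=4.0, n_grid=401):
    """Posterior mean and variance of theta on a grid.

    Assumes a uniform prior on [lower, upper] (hypothetical bounds) and treats
    the item parameters as fixed point estimates. A response of None is
    skipped as missing.
    """
    step = (upper - lower) / (n_grid - 1)
    grid = [lower + i * step for i in range(n_grid)]
    log_post = []
    for theta in grid:
        log_lik = 0.0
        for y, (a, bs) in zip(responses, items):
            if y is None:
                continue
            p = grm_category_probs(theta, a, bs)[y]
            log_lik += math.log(max(p, 1e-300))  # guard against log(0)
        log_post.append(log_lik)
    # Normalize in a numerically stable way (subtract the maximum first).
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    weights = [w / total for w in weights]
    mean = sum(w * t for w, t in zip(weights, grid))
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, grid))
    return mean, var


# Hypothetical point estimates: (discrimination, increasing thresholds) per item.
items = [
    (1.2, [-1.0, 0.0, 1.0]),
    (0.8, [-0.5, 0.5, 1.5]),
    (1.5, [-1.5, -0.5, 0.5]),
]
mean_high, var_high = score_respondent([3, 3, 3], items)
mean_low, var_low = score_respondent([0, 0, 0], items)
```

In this sketch, a respondent who answers every item in the top category has a likelihood that increases monotonically in theta, so the bounded support is what keeps the grid posterior, and hence its mean and variance, well defined.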
