Predicting respondent difficulty in web surveys: A machine-learning approach based on mouse movement features

A central goal of survey research is to collect robust and reliable data from respondents. However, despite researchers' best efforts in designing questionnaires, respondents may have difficulty understanding a question's intent and may therefore struggle to respond appropriately. If such difficulty could be detected, this knowledge could be used to inform real-time interventions through responsive questionnaire design, or to identify and correct measurement error after the fact. Previous research in the context of web surveys has used paradata, specifically response times, to detect difficulties and to help improve user experience and data quality. However, richer data sources are now available in the form of the movements respondents make with the mouse, which offer an additional and far more detailed indicator of the respondent-survey interaction. This paper uses machine learning techniques to explore the predictive value of mouse-tracking data with regard to respondents' difficulty. We use data from a survey on respondents' employment history and demographic information, in which we experimentally manipulate the difficulty of several questions. Using features derived from the cursor movements, we predict whether respondents answered the easy or difficult version of a question, comparing several state-of-the-art supervised learning methods. In addition, we develop a personalization method that adjusts for respondents' baseline mouse behavior and evaluate its performance. For all three manipulated survey questions, we find that including the full set of mouse movement features improves prediction performance over response-time-only models in nested cross-validation. Accounting for individual differences in mouse movements leads to further improvements.
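The abstract does not spell out how the personalization method adjusts for baseline mouse behavior. As a rough illustration only, the sketch below assumes one plausible reading: each feature on the target question is centered and scaled by the same respondent's values on other, non-manipulated questions. The function name `personalize`, the feature names, and the data layout are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def personalize(features, baseline_rows):
    """Center and scale one respondent's features by their own baseline.

    features: dict mapping feature name -> value on the target question.
    baseline_rows: list of dicts with the same keys, taken from the same
        respondent's other questions (an assumed layout, not the paper's).
    """
    adjusted = {}
    for name, value in features.items():
        history = [row[name] for row in baseline_rows]
        center = mean(history)
        spread = pstdev(history)
        # Fall back to plain centering when the baseline shows no variation.
        if spread > 0:
            adjusted[name] = (value - center) / spread
        else:
            adjusted[name] = value - center
    return adjusted


# Hypothetical feature names and values, for illustration only.
baseline = [
    {"response_time_s": 4.0, "cursor_distance_px": 800.0},
    {"response_time_s": 6.0, "cursor_distance_px": 1200.0},
]
target = {"response_time_s": 9.0, "cursor_distance_px": 1000.0}
adjusted = personalize(target, baseline)
```

Under this assumption, a respondent who always moves the cursor a lot is not flagged as struggling merely because of that habit; only deviations from their own typical behavior carry signal. The adjusted values could then replace the raw features as inputs to whichever classifier is being compared.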

research
05/26/2022

Classification ensembles for multivariate functional data with application to mouse movements in web surveys

We propose new ensemble models for multivariate functional data classifi...
research
06/08/2020

Validating psychometric survey responses

We present an approach to classify user validity in survey responses by ...
research
02/11/2016

HMM and DTW for evaluation of therapeutical gestures using kinect

Automatic recognition of the quality of movement in human beings is a ch...
research
11/18/2017

Household poverty classification in data-scarce environments: a machine learning approach

We describe a method to identify poor households in data-scarce countrie...
research
09/29/2019

A Longitudinal Framework for Predicting Nonresponse in Panel Surveys

Nonresponse in panel studies can lead to a substantial loss in data qual...
research
04/09/2022

Applying machine learning to predict behavior of bus transport in Warsaw, Poland

Nowadays, it is possible to collect precise data describing movements of...
research
10/27/2021

A Shiny Application for Conducting Electronic Surveys Using Randomized Response Techniques

Randomized response techniques (RRT) are useful for collecting informati...
