Increasing Students' Engagement to Reminder Emails Through Multi-Armed Bandits

08/10/2022 ∙ by Fernando J. Yanez, et al.
Conducting randomized experiments in education settings raises the question of how we can use machine learning techniques to improve educational interventions. Using Multi-Armed Bandit (MAB) algorithms such as Thompson Sampling (TS) in adaptive experiments can increase students' chances of obtaining better outcomes by raising the probability of assignment to the optimal condition (arm), even before an intervention completes. This is an advantage over traditional A/B testing, which may allocate an equal number of students to both optimal and non-optimal conditions. The problem lies in the exploration-exploitation trade-off: even though adaptive policies aim to collect enough information to reliably allocate more students to better arms, past work shows that this may not provide enough exploration to draw reliable conclusions about whether arms differ. Hence, it is of interest to provide additional uniform random (UR) exploration throughout the experiment. This paper presents a real-world adaptive experiment on how students engage with instructors' weekly email reminders intended to build their time management habits. Our metric of interest is the email open rate, tracked for each arm, where arms are represented by different subject lines. The emails are delivered under different allocation algorithms: UR, TS, and what we identify as TS†, which combines both TS and UR rewards to update its priors. We highlight problems with these adaptive algorithms, such as possible exploitation of an arm when there is no significant difference between arms, and address their causes and consequences. Future directions include studying situations where the early choice of the optimal arm is not ideal and how adaptive algorithms can address them.
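As a rough illustration of the allocation policies described in the abstract, below is a minimal Python sketch of Thompson Sampling with Beta-Bernoulli posteriors over email open rates, run alongside a uniform random share. The class name, the `ur_share` fraction, and the `use_ur_rewards` flag are illustrative assumptions, not details taken from the paper; the TS† variant is modeled simply by letting rewards from UR-assigned students also update the TS posteriors.

```python
import random


class EmailBanditExperiment:
    """Sketch of assigning email subject lines (arms) under UR, TS, or a
    TS-dagger-style policy in which UR rewards also update the TS priors.

    Rewards are Bernoulli: 1 if the student opened the email, 0 otherwise.
    Each arm keeps a Beta(alpha, beta) posterior over its open rate.
    """

    def __init__(self, n_arms, ur_share=0.5):
        self.n_arms = n_arms
        self.ur_share = ur_share      # assumed fraction of students assigned uniformly at random
        self.alpha = [1.0] * n_arms   # Beta prior successes (opens), starting from Beta(1, 1)
        self.beta = [1.0] * n_arms    # Beta prior failures (non-opens)

    def assign(self):
        """Return (arm, policy) for the next student."""
        if random.random() < self.ur_share:
            # Uniform random exploration condition.
            return random.randrange(self.n_arms), "UR"
        # Thompson Sampling: draw a plausible open rate per arm, pick the best draw.
        draws = [random.betavariate(self.alpha[k], self.beta[k]) for k in range(self.n_arms)]
        return max(range(self.n_arms), key=lambda k: draws[k]), "TS"

    def update(self, arm, opened, policy, use_ur_rewards=True):
        """Update the posterior for `arm` after observing whether the email was opened.

        With use_ur_rewards=True, UR rewards also feed the posterior (TS-dagger-style);
        with False, only TS-assigned rewards do (plain TS beside a separate UR condition).
        """
        if policy == "UR" and not use_ur_rewards:
            return
        if opened:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1
```

In this sketch, `use_ur_rewards=True` corresponds to the TS†-style pooling of TS and UR rewards described above, while `False` keeps the TS posterior separate from the UR condition; the 50/50 `ur_share` default is an assumed value, not the allocation used in the experiment.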

research ∙ 10/13/2020
Multi-Armed Bandits with Dependent Arms
We study a variant of the classical multi-armed bandit problem (MABP) wh...

research ∙ 08/10/2022
Using Adaptive Experiments to Rapidly Help Students
Adaptive experiments can increase the chance that current students obtai...

research ∙ 04/14/2018
Combining Difficulty Ranking with Multi-Armed Bandits to Sequence Educational Content
As e-learning systems become more prevalent, there is a growing need for...

research ∙ 12/15/2021
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization
Multi-armed bandit algorithms like Thompson Sampling can be used to cond...

research ∙ 01/28/2022
Networked Restless Multi-Armed Bandits for Mobile Interventions
Motivated by a broad class of mobile intervention problems, we propose a...

research ∙ 05/26/2020
To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation
Delayed rewards problem in contextual bandits has been of interest in va...
