Reinforcement Learning in Education: A Multi-Armed Bandit Approach

11/01/2022
by Herkulaas Combrink, et al.

Advances in reinforcement learning research have demonstrated how different agent-based models can learn to perform a task optimally within a given environment. Reinforcement learning solves unsupervised problems in which agents move through a state-action-reward loop to maximize their overall reward, which in turn optimizes the solution to a specific problem in a given environment. However, these algorithms are designed based on our understanding of the actions that should be taken in a real-world environment to solve a specific problem. One such problem is the ability to identify, recommend and execute an action within a system where the users are the subject, such as in education. In recent years, the use of blended learning approaches that integrate face-to-face learning with online learning has increased in the education context. Additionally, online platforms used for education require the automation of certain functions, such as the identification, recommendation or execution of actions that can benefit the user (here, the student or learner). As promising as these scientific advances are, there is still a need to conduct research in a variety of areas to ensure the successful deployment of these agents within education systems. Therefore, the aim of this study was to contextualise and simulate the cumulative reward within an environment for an intervention recommendation problem in the education context.
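
To make the idea of simulating cumulative reward for an intervention recommendation problem concrete, the following is a minimal sketch rather than the authors' method. It assumes an epsilon-greedy strategy (the abstract does not name one), treats each arm as a hypothetical intervention with a fixed, made-up probability of helping a student, and uses illustrative names such as `simulate_bandit` and `success_probs`.

```python
import random


def simulate_bandit(success_probs, steps=1000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on Bernoulli arms.

    success_probs: assumed (hypothetical) chance that each intervention helps.
    Returns (cumulative_rewards, estimates, counts).
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    counts = [0] * n_arms        # how often each intervention was recommended
    estimates = [0.0] * n_arms   # running mean reward per intervention
    cumulative = []              # cumulative reward after each step
    total = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: recommend a random intervention.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: recommend the intervention with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Reward is 1 if the (simulated) student benefits, else 0.
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

        total += reward
        cumulative.append(total)

    return cumulative, estimates, counts


if __name__ == "__main__":
    # Three hypothetical interventions with made-up success rates.
    rewards, est, n = simulate_bandit([0.2, 0.5, 0.7], steps=2000)
    print("final cumulative reward:", rewards[-1])
    print("estimated success rates:", [round(e, 2) for e in est])
    print("times recommended:", n)
```

The cumulative reward curve returned here is the quantity the abstract refers to; in a real deployment the success probabilities would be unknown and the rewards would come from observed student outcomes instead of a random number generator.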


research
03/22/2020

Reinforcement Learning in Economics and Finance

Reinforcement learning algorithms describe how an agent can learn an opt...
research
12/20/2022

Bandit approach to conflict-free multi-agent Q-learning in view of photonic implementation

Recently, extensive studies on photonic reinforcement learning to accele...
research
08/04/2015

Staged Multi-armed Bandits

In this paper, we introduce a new class of reinforcement learning method...
research
09/21/2018

Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration

Autonomous cyber-physical agents and systems play an increasingly large ...
research
10/28/2018

MaxHedge: Maximising a Maximum Online with Theoretical Performance Guarantees

We introduce a new online learning framework where, at each trial, the l...
research
05/19/2022

Multi-Armed Bandits in Brain-Computer Interfaces

The multi-armed bandit (MAB) problem models a decision-maker that optimi...
research
06/09/2020

Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior

Prisoner's Dilemma mainly treat the choice to cooperate or defect as an ...
