Value Driven Representation for Human-in-the-Loop Reinforcement Learning

04/02/2020
by Ramtin Keramati, et al.

Interactive adaptive systems powered by Reinforcement Learning (RL) have many potential applications, such as intelligent tutoring systems. In such systems, an external human system designer typically creates, monitors, and modifies the interactive adaptive system, trying to improve its performance on the target outcomes. In this paper we focus on the algorithmic foundations of helping the system designer choose the set of sensors or features that define the observation space used by the reinforcement learning agent. We present an algorithm, value driven representation (VDR), that can iteratively and adaptively augment the observation space of a reinforcement learning agent so that it is sufficient to capture a (near) optimal policy. To do so, we introduce a new method to optimistically estimate the value of a policy using offline simulated Monte Carlo rollouts. We evaluate the performance of our approach on standard RL benchmarks with simulated humans and demonstrate significant improvement over prior baselines.
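To make the phrase "optimistically estimate the value of a policy using offline simulated Monte Carlo rollouts" concrete, the sketch below shows one generic way such an estimate can be formed: average the discounted returns of simulated episodes, then add a one-sided Hoeffding confidence bonus. This is an illustration only and is not the estimator defined in the paper; the toy chain environment `chain_step`, the function names, and all parameter values are hypothetical choices made for this example.

```python
import math
import random


def rollout_return(policy, step, start_state, horizon, gamma, rng):
    """Discounted return of a single simulated episode."""
    state, total, discount = start_state, 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward, done = step(state, action, rng)
        total += discount * reward
        discount *= gamma
        if done:
            break
    return total


def optimistic_value(policy, step, start_state, n_rollouts=200, horizon=20,
                     gamma=0.95, r_max=1.0, delta=0.05, seed=0):
    """Return (mean, upper): the Monte Carlo mean return and an optimistic
    upper bound on the policy value, assuming rewards lie in [0, r_max]."""
    rng = random.Random(seed)
    returns = [rollout_return(policy, step, start_state, horizon, gamma, rng)
               for _ in range(n_rollouts)]
    mean = sum(returns) / n_rollouts
    # Largest possible discounted return over `horizon` steps.
    v_max = r_max * (1.0 - gamma ** horizon) / (1.0 - gamma)
    # One-sided Hoeffding bonus: holds with probability at least 1 - delta.
    bonus = v_max * math.sqrt(math.log(1.0 / delta) / (2.0 * n_rollouts))
    return mean, min(mean + bonus, v_max)


def chain_step(state, action, rng):
    """Hypothetical 5-state chain: the intended move succeeds with
    probability 0.8; reaching state 4 gives reward 1 and ends the episode."""
    move = action if rng.random() < 0.8 else -action
    next_state = max(0, min(4, state + move))
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


if __name__ == "__main__":
    always_right = lambda state: 1
    mean, upper = optimistic_value(always_right, chain_step, start_state=0)
    print(f"mean estimate: {mean:.3f}, optimistic estimate: {upper:.3f}")
```

The bonus shrinks as `n_rollouts` grows, so the optimistic estimate approaches the plain Monte Carlo mean when more simulated episodes are available.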


