Interaction-Grounded Learning with Action-inclusive Feedback

06/16/2022
by   Tengyang Xie, et al.

Consider the problem setting of Interaction-Grounded Learning (IGL), in which a learner's goal is to interact optimally with the environment without any explicit reward to ground its policies. The agent observes a context vector, takes an action, and receives a feedback vector, and it must use this information to optimize a policy with respect to a latent reward function. Previously analyzed approaches fail when the feedback vector contains the action, which significantly limits IGL's applicability in many potential scenarios, such as brain-computer interface (BCI) or human-computer interface (HCI) applications. We address this by creating an algorithm and analysis that allow IGL to work even when the feedback vector contains the action, encoded in any fashion. We provide theoretical guarantees and large-scale experiments based on supervised datasets to demonstrate the effectiveness of the new approach.
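The interaction protocol described in the abstract (observe a context, take an action, receive a feedback vector, never see the reward) can be sketched as a toy simulation. This is a minimal illustration under stated assumptions, not the paper's algorithm: the class `ToyIGLEnvironment`, the reward rule, the one-hot action encoding, and the uniform policy are all hypothetical choices made here for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Interaction:
    """What the learner logs each round. There is no reward field by design."""
    context: list
    action: int
    feedback: list


class ToyIGLEnvironment:
    """Hypothetical toy environment (an assumption, not from the paper)."""

    def __init__(self, num_actions, seed=0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self._context = None

    def observe_context(self):
        # Context vector with one coordinate per action (an assumed design).
        self._context = [self.rng.random() for _ in range(self.num_actions)]
        return list(self._context)

    def _latent_reward(self, action):
        # Assumed rule: the rewarded action is the index of the largest
        # context coordinate. The learner never observes this value.
        best = max(range(self.num_actions), key=lambda i: self._context[i])
        return 1 if action == best else 0

    def feedback(self, action):
        # Action-inclusive feedback: the vector carries the action itself
        # (here as a one-hot, one of many possible encodings) plus a noisy
        # signal that depends on the latent reward.
        reward = self._latent_reward(action)
        one_hot = [1.0 if a == action else 0.0 for a in range(self.num_actions)]
        signal = reward + self.rng.gauss(0.0, 0.1)
        return one_hot + [signal]


def collect(env, rounds, seed=1):
    """Run the context -> action -> feedback loop with a uniform policy."""
    policy_rng = random.Random(seed)
    log = []
    for _ in range(rounds):
        context = env.observe_context()
        action = policy_rng.randrange(env.num_actions)  # placeholder policy
        log.append(Interaction(context, action, env.feedback(action)))
    return log


log = collect(ToyIGLEnvironment(num_actions=3), rounds=5)
```

Each logged `Interaction` holds only the context, the action, and the feedback vector; a learner in this setting would have to infer the latent reward from those three quantities alone.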

research
06/09/2021

Interaction-Grounded Learning

Consider a prosthetic arm, learning to adapt to its user's control signa...
research
02/19/2019

Learning to Generalize from Sparse and Underspecified Rewards

We consider the problem of learning from sparse and underspecified rewar...
research
11/28/2022

Personalized Reward Learning with Interaction-Grounded Learning (IGL)

In an era of countless content offerings, recommender systems alleviate ...
research
05/29/2023

How to Query Human Feedback Efficiently in RL?

Reinforcement Learning with Human Feedback (RLHF) is a paradigm in which...
research
04/12/2021

An Efficient Algorithm for Deep Stochastic Contextual Bandits

In stochastic contextual bandit (SCB) problems, an agent selects an acti...
research
06/23/2020

Environment Shaping in Reinforcement Learning using State Abstraction

One of the central challenges faced by a reinforcement learning (RL) age...
research
08/03/2023

Fast Slate Policy Optimization: Going Beyond Plackett-Luce

An increasingly important building block of large scale machine learning...
