Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning

11/29/2022
by   Guoxi Zhang, et al.
0

Offline reinforcement learning (RL) have received rising interest due to its appealing data efficiency. The present study addresses behavior estimation, a task that lays the foundation of many offline RL algorithms. Behavior estimation aims at estimating the policy with which training data are generated. In particular, this work considers a scenario where the data are collected from multiple sources. In this case, neglecting data heterogeneity, existing approaches for behavior estimation suffers from behavior misspecification. To overcome this drawback, the present study proposes a latent variable model to infer a set of policies from data, which allows an agent to use as behavior policy the policy that best describes a particular trajectory. This model provides with a agent fine-grained characterization for multi-source data and helps it overcome behavior misspecification. This work also proposes a learning algorithm for this model and illustrates its practical usage via extending an existing offline RL algorithm. Lastly, with extensive evaluation this work confirms the existence of behavior misspecification and the efficacy of the proposed model.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/22/2023

Behavior Proximal Policy Optimization

Offline reinforcement learning (RL) is a challenging setting where exist...
research
05/24/2023

Matrix Estimation for Offline Reinforcement Learning with Low-Rank Structure

We consider offline Reinforcement Learning (RL), where the agent does no...
research
06/16/2023

π2vec: Policy Representations with Successor Features

This paper describes π2vec, a method for representing behaviors of black...
research
07/26/2022

Offline Reinforcement Learning at Multiple Frequencies

Leveraging many sources of offline robot data requires grappling with th...
research
06/14/2023

Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources

Existing theoretical studies on offline reinforcement learning (RL) most...
research
06/27/2020

Overfitting and Optimization in Offline Policy Learning

We consider the task of policy learning from an offline dataset generate...
research
07/10/2023

Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning

Offline Reinforcement Learning (RL) methods leverage previous experience...

Please sign up or login with your details

Forgot password? Click here to reset