Offline Reinforcement Learning at Multiple Frequencies

07/26/2022
by Kaylee Burns, et al.

Leveraging many sources of offline robot data requires grappling with the heterogeneity of such data. In this paper, we focus on one particular aspect of heterogeneity: learning from offline data collected at different control frequencies. Across labs, the discretization of controllers, sampling rates of sensors, and demands of a task of interest may differ, giving rise to a mixture of frequencies in an aggregated dataset. We study how well offline reinforcement learning (RL) algorithms can accommodate data with a mixture of frequencies during training. We observe that the Q-value propagates at different rates for different discretizations, leading to a number of learning challenges for off-the-shelf offline RL. We present a simple yet effective solution that enforces consistency in the rate of Q-value updates to stabilize learning. By scaling the value of N in N-step returns with the discretization size, we effectively balance Q-value propagation, leading to more stable convergence. On three simulated robotic control problems, we empirically find that this simple approach outperforms naïve mixing by 50% on average.
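The core idea, scaling N so that an N-step return spans a comparable amount of real time regardless of a data source's control frequency, can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name n_step_targets, the horizon_sec parameter, and the per-second discounting convention are choices made for the example.

```python
import numpy as np

def n_step_targets(rewards, next_q, dones, dt,
                   gamma_per_sec=0.99, horizon_sec=0.5):
    """Compute N-step return targets where N scales with the control period dt.

    rewards:     (T,) rewards from one trajectory collected at period dt seconds
    next_q:      (T,) bootstrapped value estimates for the next state at each step
    dones:       (T,) episode-termination flags
    dt:          control period (seconds) of this data source
    horizon_sec: fixed real-time lookahead shared across all frequencies (assumption)
    """
    # Choose N so that N * dt is roughly constant across data sources,
    # keeping the rate of Q-value propagation comparable.
    n = max(1, int(round(horizon_sec / dt)))
    # Per-step discount consistent with the timestep (assumption; a fixed
    # per-step gamma could be used instead).
    gamma = gamma_per_sec ** dt

    T = len(rewards)
    targets = np.zeros(T)
    for t in range(T):
        ret, disc, terminated = 0.0, 1.0, False
        last = t
        for k in range(t, min(t + n, T)):
            ret += disc * rewards[k]
            disc *= gamma
            last = k
            if dones[k]:
                terminated = True
                break
        # Bootstrap from the value estimate after the last accumulated step
        # unless the episode terminated inside the window.
        if not terminated:
            ret += disc * next_q[last]
        targets[t] = ret
    return targets
```

With this scheme, data recorded at 10 Hz (dt = 0.1) would use N = 5 while data at 50 Hz (dt = 0.02) would use N = 25, so both targets look half a second ahead in real time.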

research
10/19/2021

Offline Reinforcement Learning with Value-based Episodic Memory

Offline reinforcement learning (RL) shows promise of applying RL to real...
research
11/29/2022

Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning

Offline reinforcement learning (RL) has received rising interest due to...
research
09/16/2021

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

Offline reinforcement learning (RL) algorithms have shown promising resu...
research
02/10/2021

Risk-Averse Offline Reinforcement Learning

Training Reinforcement Learning (RL) agents in high-stakes applications ...
research
10/12/2022

Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories

Natural agents can effectively learn from multiple data sources that dif...
research
10/31/2022

Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

Learning to control an agent from data collected offline in a rich pixel...
research
07/16/2020

Mixture of Step Returns in Bootstrapped DQN

The concept of utilizing multi-step returns for updating value functions...
