Continuous Control with Action Quantization from Demonstrations

10/19/2021
by Robert Dadashi, et al.

In Reinforcement Learning (RL), discrete actions, as opposed to continuous actions, lead to less complex exploration problems and allow the maximum of the action-value function, which is central to dynamic programming-based methods, to be computed immediately. In this paper, we propose a novel method, Action Quantization from Demonstrations (AQuaDem), which learns a discretization of continuous action spaces by leveraging the priors of demonstrations. This dramatically reduces the exploration problem, since the actions available to the agent are not only finite in number but also plausible in light of the demonstrator's behavior. By discretizing the action space, we can apply any discrete-action deep RL algorithm to the continuous control problem. We evaluate the proposed method on three different setups: RL with demonstrations, RL with play data (demonstrations of a human playing in an environment but not solving any specific task), and Imitation Learning. For all three setups, we only consider human data, which is more challenging than synthetic data. We find that AQuaDem consistently outperforms state-of-the-art continuous control methods, both in terms of performance and sample efficiency. We provide visualizations and videos on the paper's website: https://google-research.github.io/aquadem.
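
To illustrate why a finite action set makes the maximum of the action-value function immediate to compute, here is a minimal sketch in Python. It is not the authors' implementation: the function name `greedy_continuous_action`, the candidate actions, and the Q-values are all hypothetical, and the candidates are simply hard-coded rather than learned from demonstrations. It only shows that, once a state has K candidate continuous actions, greedy action selection reduces to an argmax over K values.

```python
import numpy as np


def greedy_continuous_action(q_values, candidate_actions):
    """Return the index and the continuous action with the highest Q-value.

    q_values: shape (K,), one value per discrete action index (hypothetical).
    candidate_actions: shape (K, action_dim), the K continuous actions
        proposed for the current state (hypothetical, hard-coded below).
    """
    q_values = np.asarray(q_values, dtype=float)
    candidate_actions = np.asarray(candidate_actions, dtype=float)
    if q_values.shape[0] != candidate_actions.shape[0]:
        raise ValueError("need exactly one Q-value per candidate action")
    # With a finite action set, the max over actions is a plain argmax.
    best = int(np.argmax(q_values))
    return best, candidate_actions[best]


# Three made-up 2-D candidate actions for a single state.
candidates = np.array([[-1.0, 0.0], [0.0, 0.5], [1.0, -0.5]])
# Made-up Q-values, one per candidate.
q = np.array([0.1, 0.7, 0.3])

index, action = greedy_continuous_action(q, candidates)
```

In this sketch, any discrete-action algorithm would only need to produce the K values in `q`; the chosen index is then mapped back to a continuous action before it is sent to the environment.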

research
02/05/2020

Deep RBF Value Functions for Continuous Control

A core operation in reinforcement learning (RL) is finding an action tha...
research
11/03/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Reinforcement learning (RL) for continuous control typically employs dis...
research
06/09/2023

Value function estimation using conditional diffusion models for control

A fairly reliable trend in deep reinforcement learning is that the perfo...
research
01/02/2022

Toward Causal-Aware RL: State-Wise Action-Refined Temporal Difference

Although it is well known that exploration plays a key role in Reinforce...
research
09/26/2019

CAQL: Continuous Action Q-Learning

Value-based reinforcement learning (RL) methods like Q-learning have sho...
research
11/22/2020

Reinforcement learning with distance-based incentive/penalty (DIP) updates for highly constrained industrial control systems

Typical reinforcement learning (RL) methods show limited applicability f...
research
09/10/2021

Discretizing Dynamics for Maximum Likelihood Constraint Inference

Maximum likelihood constraint inference is a powerful technique for iden...
