Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces

10/22/2018
by Sungsu Lim, et al.

Value-based approaches can be difficult to use in continuous action spaces because finding the greedy action requires solving an optimization over the action-values. A common strategy is to restrict the functional form of the action-values to be convex or quadratic in the actions, which simplifies this optimization. Such restrictions, however, can prevent learning accurate action-values. In this work, we propose the Actor-Expert framework for value-based methods, which decouples action selection (Actor) from the action-value representation (Expert). The Expert uses Q-learning to update the action-values towards the optimal action-values, whereas the Actor learns to output the greedy action for the current action-values. We develop a Conditional Cross Entropy Method for the Actor, which learns the greedy action for a generically parameterized Expert, and provide a two-timescale analysis to validate its asymptotic behavior. We demonstrate in a toy domain with bimodal action-values that previous restrictive action-value methods fail, whereas the decoupled Actor-Expert with a more general action-value parameterization succeeds. Finally, we demonstrate that Actor-Expert performs as well as or better than these other methods on several benchmark continuous-action domains.
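
To make the greedy-action problem concrete, the following is a minimal sketch of a plain cross-entropy method searching over a one-dimensional action for a single fixed state. It is not the paper's Conditional Cross Entropy Method, which learns a state-conditioned Actor; it only illustrates the underlying sample, select, and refit loop. The function `bimodal_q`, the action bounds, and all hyperparameters are hypothetical choices made for this illustration.

```python
import math
import random


def bimodal_q(action):
    """Hypothetical action-values for one fixed state: two peaks, the taller near a = 1."""
    low_mode = 1.0 * math.exp(-((action + 1.0) ** 2) / 0.08)
    high_mode = 1.5 * math.exp(-((action - 1.0) ** 2) / 0.08)
    return low_mode + high_mode


def cem_greedy_action(q, mean=0.0, std=1.0, n_samples=64, elite_frac=0.25,
                      n_iters=30, low=-2.0, high=2.0, seed=0):
    """Plain cross-entropy method: repeatedly refit a Gaussian to the best sampled actions."""
    rng = random.Random(seed)
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(n_iters):
        # Sample candidate actions and clip them to the (assumed) action bounds.
        actions = [min(high, max(low, rng.gauss(mean, std))) for _ in range(n_samples)]
        # Keep the candidates with the highest action-values (the elite set).
        elites = sorted(actions, key=q, reverse=True)[:n_elite]
        # Refit the sampling distribution to the elite set.
        mean = sum(elites) / n_elite
        var = sum((a - mean) ** 2 for a in elites) / n_elite
        std = max(math.sqrt(var), 1e-3)  # floor so the sampler never has zero width
    return mean


greedy = cem_greedy_action(bimodal_q)
print(f"approximate greedy action: {greedy:.3f}, Q = {bimodal_q(greedy):.3f}")
```

Because the search only evaluates `q` at sampled actions, it places no convexity or quadratic restriction on the action-values, which is the property the abstract attributes to the decoupled Actor. A single Gaussian fitted this way may still settle on either peak depending on the random draws, so this sketch should not be read as reproducing the paper's results.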
