GMAC: A Distributional Perspective on Actor-Critic Framework

05/24/2021
by Daniel Wontae Nam, et al.

In this paper, we devise a distributional framework for actor-critic methods as a solution to distributional instability, action type restriction, and conflation between samples and statistics. We propose a new method that minimizes the Cramér distance to the multi-step Bellman target distribution generated by a novel Sample-Replacement algorithm, denoted SR(λ), which learns the correct value distribution under multiple Bellman operations. Parameterizing the value distribution with a Gaussian mixture model further improves the efficiency and performance of the method, which we name GMAC. We empirically show that GMAC captures the correct representation of value distributions and improves the performance of a conventional actor-critic method at low computational cost, in both discrete and continuous action spaces, using the Arcade Learning Environment (ALE) and PyBullet environments.
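To make the loss named in the abstract concrete, here is a minimal sketch of the squared Cramér distance, the integral of (F_P(x) - F_Q(x))^2 over x, computed between two sets of scalar samples. It is an illustration only and not the authors' implementation: GMAC parameterizes the value distribution as a Gaussian mixture, whereas this sketch uses empirical CDFs, and the function name `cramer_distance_sq` is hypothetical.

```python
import numpy as np


def cramer_distance_sq(xs, ys):
    """Squared Cramér distance between two empirical 1-D distributions.

    Assumption: each sample set is treated as a uniform mixture of point
    masses, so both CDFs are step functions and the integral is exact.
    """
    xs = np.sort(np.asarray(xs, dtype=float))
    ys = np.sort(np.asarray(ys, dtype=float))
    # Both CDFs are constant between consecutive points of the merged grid.
    grid = np.sort(np.concatenate([xs, ys]))
    left = grid[:-1]
    # Empirical CDF values on each interval [grid[i], grid[i+1]).
    cdf_x = np.searchsorted(xs, left, side="right") / xs.size
    cdf_y = np.searchsorted(ys, left, side="right") / ys.size
    widths = np.diff(grid)
    return float(np.sum((cdf_x - cdf_y) ** 2 * widths))
```

For two point masses, the value equals the gap between them: `cramer_distance_sq([0.0], [2.0])` evaluates to `2.0`, and identical sample sets give `0.0`.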

research
03/04/2022

A Small Gain Analysis of Single Timescale Actor Critic

We consider a version of actor-critic which uses proportional step-sizes...
research
06/10/2018

Distributional Advantage Actor-Critic

In traditional reinforcement learning, an agent maximizes the reward col...
research
12/29/2022

Invariance to Quantile Selection in Distributional Continuous Control

In recent years distributional reinforcement learning has produced many ...
research
06/23/2022

CGAR: Critic Guided Action Redistribution in Reinforcement Leaning

Training a game-playing reinforcement learning agent requires multiple i...
research
09/21/2022

Revisiting Discrete Soft Actor-Critic

We study the adaption of soft actor-critic (SAC) from continuous action ...
research
07/13/2020

Implicit Distributional Reinforcement Learning

To improve the sample efficiency of policy-gradient based reinforcement ...
research
06/11/2023

PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm

In this paper, we propose the first fully push-forward-based Distributio...
