Boosting the Actor with Dual Critic

12/29/2017
by Bo Dai, et al.

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic, or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function named the dual critic. Compared to its actor-critic relatives, Dual-AC has the desirable property that the actor and dual critic are updated cooperatively to optimize the same objective function, which provides a more transparent way of learning a critic that is directly related to the actor's objective. We then provide a concrete algorithm that effectively solves the minimax optimization problem using multi-step bootstrapping, path regularization, and a stochastic dual ascent algorithm. We demonstrate that the proposed algorithm achieves state-of-the-art performance across several benchmarks.
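
As a rough illustration of the "Lagrangian dual form" mentioned above, the sketch below uses the textbook linear-programming view of the Bellman optimality equation: minimize (1 - gamma) * E_mu[V(s)] subject to V(s) >= R(s, a) + gamma * E[V(s')] for every state-action pair, with a nonnegative multiplier rho(s, a) on each constraint. This is not the Dual-AC algorithm itself. The two-state MDP, all of its numbers, and every function name here are hypothetical, and the paper's exact scaling and parameterization may differ.

```python
# Toy 2-state, 2-action MDP. All numbers are invented for illustration.
GAMMA = 0.9
STATES = [0, 1]
ACTIONS = [0, 1]
# P[s][a] is the next-state distribution; action 0 leads to state 0, action 1 to state 1.
P = {0: {0: [1.0, 0.0], 1: [0.0, 1.0]},
     1: {0: [1.0, 0.0], 1: [0.0, 1.0]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 0.0, 1: 2.0}}
MU = [0.5, 0.5]  # initial-state distribution


def backup(V, s, a):
    """One-step lookahead: R(s, a) + gamma * E[V(s')]."""
    return R[s][a] + GAMMA * sum(p * V[t] for t, p in enumerate(P[s][a]))


def value_iteration(iters=2000):
    """Solve the Bellman optimality equation by fixed-point iteration."""
    V = [0.0 for _ in STATES]
    for _ in range(iters):
        V = [max(backup(V, s, a) for a in ACTIONS) for s in STATES]
    return V


def greedy_policy(V):
    """Deterministic policy that picks the action with the largest backup."""
    return [max(ACTIONS, key=lambda a: backup(V, s, a)) for s in STATES]


def occupancy(policy, steps=2000):
    """Normalized discounted state-action occupancy of a deterministic policy."""
    rho = [[0.0 for _ in ACTIONS] for _ in STATES]
    d = list(MU)            # state distribution at the current time step
    weight = 1.0 - GAMMA    # (1 - gamma) * gamma**t
    for _ in range(steps):
        nxt = [0.0 for _ in STATES]
        for s in STATES:
            a = policy[s]
            rho[s][a] += weight * d[s]
            for t, p in enumerate(P[s][a]):
                nxt[t] += d[s] * p
        d = nxt
        weight *= GAMMA
    return rho


def lagrangian(V, rho):
    """L(V, rho) = (1 - gamma) * E_mu[V] + sum_{s,a} rho(s,a) * (backup(s,a) - V(s))."""
    primal = (1 - GAMMA) * sum(MU[s] * V[s] for s in STATES)
    penalty = sum(rho[s][a] * (backup(V, s, a) - V[s])
                  for s in STATES for a in ACTIONS)
    return primal + penalty


V_star = value_iteration()
rho_star = occupancy(greedy_policy(V_star))
primal_value = (1 - GAMMA) * sum(MU[s] * V_star[s] for s in STATES)
print(primal_value, lagrangian(V_star, rho_star), lagrangian([0.0, 0.0], rho_star))
```

In this toy problem the three printed numbers agree up to floating-point error. With the multiplier fixed at the optimal policy's occupancy, the Lagrangian no longer depends on V and equals the expected reward under that occupancy. With V fixed at the optimal value function, no nonnegative multiplier can raise the Lagrangian above the primal value. That saddle-point structure is what allows the problem to be read as a two-player game between a value-like function and a policy-like multiplier.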


research
12/19/2018

TD-Regularized Actor-Critic Methods

Actor-critic methods can achieve incredible performance on difficult rei...
research
01/30/2023

PAC-Bayesian Soft Actor-Critic Learning

Actor-critic algorithms address the dual goals of reinforcement learning...
research
03/24/2022

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

Driving 3D characters to dance following a piece of music is highly chal...
research
10/21/2021

Finite-Time Complexity of Online Primal-Dual Natural Actor-Critic Algorithm for Constrained Markov Decision Processes

We consider a discounted cost constrained Markov decision process (CMDP)...
research
08/19/2023

PACE: Improving Prompt with Actor-Critic Editing for Large Language Model

Large language models (LLMs) have showcased remarkable potential across ...
research
12/20/2022

Hybrid Rule-Neural Coreference Resolution System based on Actor-Critic Learning

A coreference resolution system is to cluster all mentions that refer to...
research
05/23/2019

Exploiting Cognitive Structure for Adaptive Learning

Adaptive learning, also known as adaptive teaching, relies on learning p...
