Improving Policy Optimization with Generalist-Specialist Learning

06/26/2022
by Zhiwei Jia, et al.

Generalization in deep reinforcement learning over unseen environment variations usually requires policy learning over a large set of diverse training variations. We empirically observe that an agent trained on many variations (a generalist) tends to learn faster at the beginning, yet its performance plateaus at a suboptimal level for a long time. In contrast, an agent trained on only a few variations (a specialist) can often achieve high returns under a limited computational budget. To get the best of both worlds, we propose a novel generalist-specialist training framework. Specifically, we first train a generalist on all environment variations; when it fails to improve, we launch a large population of specialists with weights cloned from the generalist, each trained to master a selected small subset of variations. We finally resume the training of the generalist with auxiliary rewards induced by demonstrations from all specialists. In particular, we investigate when to start specialist training and compare strategies for learning generalists with assistance from specialists. We show that this framework pushes the envelope of policy learning on several challenging and popular benchmarks, including Procgen, Meta-World and ManiSkill.
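The first two stages described in the abstract (detecting that the generalist has stopped improving, then splitting the variations among specialists) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `has_plateaued`, `first_plateau_step`, and `assign_variations` are hypothetical, the plateau test (comparing the mean return of two adjacent windows) is one simple choice among many, and the even chunking of variation IDs stands in for whatever selection rule the authors actually use. Specialist training and the demonstration-based auxiliary reward are omitted.

```python
from typing import List, Optional, Sequence


def has_plateaued(returns: Sequence[float], window: int = 5, tol: float = 0.01) -> bool:
    """Hypothetical plateau test (an assumption, not the paper's criterion).

    Reports a plateau when the mean return over the most recent `window`
    evaluations exceeds the mean over the preceding `window` by less than `tol`.
    """
    if len(returns) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(returns[-window:]) / window
    previous = sum(returns[-2 * window:-window]) / window
    return recent - previous < tol


def first_plateau_step(returns: Sequence[float], window: int = 5,
                       tol: float = 0.01) -> Optional[int]:
    """Index of the first evaluation at which the plateau test fires, or None."""
    for t in range(len(returns)):
        if has_plateaued(returns[: t + 1], window, tol):
            return t
    return None


def assign_variations(variation_ids: Sequence[int], per_specialist: int) -> List[List[int]]:
    """Split variation IDs into small consecutive subsets, one per specialist.

    Even chunking is an illustrative assumption; the paper only says each
    specialist masters "a selected small subset of variations".
    """
    ids = list(variation_ids)
    return [ids[i:i + per_specialist] for i in range(0, len(ids), per_specialist)]


if __name__ == "__main__":
    # Made-up generalist learning curve: fast early gains, then a flat stretch.
    curve = [0, 1, 2, 3, 4, 5, 5, 5, 5, 5, 5]
    step = first_plateau_step(curve, window=3, tol=0.01)
    groups = assign_variations(range(5), per_specialist=2)
    print("launch specialists at evaluation", step)
    print("specialist assignments:", groups)
```

In this sketch, each list returned by `assign_variations` would correspond to one specialist initialized from a copy of the generalist's weights at the plateau step; how many variations each specialist receives is a free parameter here.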
