Continual Reinforcement Learning with Group Symmetries

10/21/2022
by   Mengdi Xu, et al.

Continual reinforcement learning (RL) aims to learn a sequence of tasks while retaining the ability to solve previously seen tasks and growing a new policy to solve novel ones. Existing continual RL methods ignore the fact that some tasks are equivalent under simple group operations, such as rotations or translations. They therefore grow a new policy for each equivalent task and train it from scratch, which results in poor sample efficiency and generalization. In this work, we propose a novel continual RL framework with group symmetries, which grows a policy for each group of equivalent tasks instead of for each single task. We introduce a PPO-based RL algorithm with an invariant feature extractor and a novel task grouping mechanism based on invariant features. We test our algorithm in realistic autonomous driving scenarios, where each group is associated with a map configuration. We show that our algorithm assigns tasks to different groups with high accuracy and outperforms baselines in generalization capability by a large margin.
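To make the idea of grouping tasks by invariant features concrete, here is a minimal sketch in Python. It is not the paper's implementation: the rotation group (four 90-degree rotations), the fixed random projection `W`, the averaging-based invariant extractor, and the distance threshold in `assign_group` are all assumptions chosen for illustration. It only shows the general principle that observations related by a group operation map to the same feature and can therefore share one policy group.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical fixed projection for 4x4 grid observations (illustrative only).
W = rng.normal(size=(8, 16)) * 0.25

def base_features(obs: np.ndarray) -> np.ndarray:
    """A plain, non-invariant feature map of a 4x4 observation."""
    return np.tanh(W @ obs.reshape(-1))

def invariant_features(obs: np.ndarray) -> np.ndarray:
    """Average the base features over all four 90-degree rotations.

    Rotating the input only permutes the terms of the average, so the
    result is the same (up to floating-point rounding) for every rotation.
    """
    return np.mean([base_features(np.rot90(obs, k)) for k in range(4)], axis=0)

def assign_group(feature: np.ndarray, centroids: list, threshold: float = 1e-3) -> int:
    """Return the index of the first stored group whose feature lies within
    `threshold` of `feature`; otherwise open a new group (assumed rule)."""
    for idx, centroid in enumerate(centroids):
        if np.linalg.norm(feature - centroid) < threshold:
            return idx
    centroids.append(feature)
    return len(centroids) - 1

# Three toy "tasks": a map, a rotated copy of it, and an unrelated map.
task_a = rng.normal(size=(4, 4))
task_b = np.rot90(task_a)
task_c = rng.normal(size=(4, 4))

centroids = []
groups = [assign_group(invariant_features(t), centroids)
          for t in (task_a, task_b, task_c)]
print(groups)
```

In this toy setup the rotated copy lands in the same group as the original map, while the unrelated map opens a new group. A continual learner built on this idea would then train one policy per group rather than one per task.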


research
11/30/2022

General policy mapping: online continual reinforcement learning inspired on the insect brain

We have developed a model for online continual or lifelong reinforcement...
research
12/08/2021

CoMPS: Continual Meta Policy Search

We develop a new continual meta-learning method to address challenges in...
research
09/28/2022

Disentangling Transfer in Continual Reinforcement Learning

The ability of continual learning systems to transfer knowledge from pre...
research
10/13/2021

Block Contextual MDPs for Continual Learning

In reinforcement learning (RL), when defining a Markov Decision Process ...
research
05/29/2023

Continual Task Allocation in Meta-Policy Network via Sparse Prompting

How to train a generalizable meta-policy by continually learning a seque...
research
07/13/2022

Continual Meta-Reinforcement Learning for UAV-Aided Vehicular Wireless Networks

Unmanned aerial base stations (UABSs) can be deployed in vehicular wirel...
research
07/11/2019

DisCoRL: Continual Reinforcement Learning via Policy Distillation

In multi-task reinforcement learning there are two main challenges: at t...
