Towards Task-Prioritized Policy Composition

09/20/2022
by Finn Rietz, et al.

Combining learned policies in a prioritized, ordered manner is desirable because it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized by null-space control, where low-priority control actions are projected into the null-space of high-priority control actions. No such method is currently available for Reinforcement Learning. We propose a task-prioritized composition framework for Reinforcement Learning that builds on a novel concept: the indifferent-space of Reinforcement Learning policies. Our framework has the potential to facilitate knowledge transfer and modular design while greatly increasing data efficiency and data reuse for Reinforcement Learning agents. Further, our approach can ensure high-priority constraint satisfaction, which makes it promising for learning in safety-critical domains like robotics. Unlike null-space control, our approach allows learning globally optimal policies for the compound task by online learning in the indifferent-space of higher-level policies after initial compound policy construction.
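
For readers unfamiliar with the control-theoretic baseline mentioned above, the following is a minimal sketch of classic null-space projection. It illustrates only the prioritized composition used in control theory, not the Reinforcement Learning framework proposed in the paper. The Jacobian `J_high`, the task velocity `dx_high`, the low-priority command `dq_low`, and the function names are all hypothetical values and names chosen for illustration.

```python
import numpy as np

def null_space_projector(J_high):
    """Return N = I - pinv(J) @ J, the projector onto the null space of J_high."""
    n = J_high.shape[1]
    return np.eye(n) - np.linalg.pinv(J_high) @ J_high

def compose(J_high, dx_high, dq_low):
    """Combine a high-priority task with a low-priority command.

    The low-priority command is projected into the null space of the
    high-priority task, so it cannot disturb that task.
    """
    dq_high = np.linalg.pinv(J_high) @ dx_high   # minimum-norm solution for the high-priority task
    return dq_high + null_space_projector(J_high) @ dq_low

# Hypothetical 3-joint system with a 1-dimensional high-priority task.
J_high = np.array([[1.0, 1.0, 0.0]])
dx_high = np.array([0.5])
dq_low = np.array([1.0, -1.0, 2.0])

dq = compose(J_high, dx_high, dq_low)
```

In this sketch the composed command still achieves the high-priority task exactly (`J_high @ dq` equals `dx_high`), while the low-priority command acts only in directions the high-priority task does not constrain.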

