Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot Transfer

07/06/2020
by Thomas J. Ringstrom, et al.

In hierarchical control, compositionality, abstraction, and task transfer are crucial for designing versatile algorithms that can solve a variety of problems with maximal representational reuse. We propose a novel hierarchical and compositional framework, Jump-Operator Dynamic Programming, for quickly computing solutions within a super-exponential space of sequential sub-goal tasks with ordering constraints, and we provide a fast linearly-solvable algorithm as an implementation. The approach involves control over an ensemble of reusable goal-conditioned policies functioning as temporally extended actions, and it uses transition operators called feasibility functions to summarize the initial-to-final state dynamics of those policies. Consequently, the added complexity of grounding a high-level task space onto a larger ambient state space can be mitigated by optimizing in a lower-dimensional subspace defined by the grounding, substantially improving the scalability of the algorithm while yielding transferable solutions. We then identify classes of objective functions on this subspace whose solutions are invariant to the grounding, resulting in optimal zero-shot transfer.
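As a rough illustration of the ideas above (not the authors' implementation), the short Python sketch below shows how an ensemble of goal-conditioned policies can be summarized by feasibility operators that map an initial state directly to a policy's final state and cost, and how a planner can then search over sub-goal orderings entirely in that low-dimensional sub-goal space. All names (goal_conditioned_policy_rollout, feas, plan_subgoals) and the toy dynamics are assumptions made for illustration; the paper's actual algorithm is a fast linearly-solvable dynamic program, whereas the ordering search here is a naive brute force kept only to make the role of the feasibility summaries concrete.

import itertools

# Toy setting: states are the integers 0..N_STATES-1; a few of them are sub-goal states.
N_STATES = 20
SUBGOALS = [4, 9, 14, 19]      # sub-goal states the task is defined over (hypothetical)
ORDER = {(9, 4)}               # ordering constraint: state 4 must be visited before state 9

def goal_conditioned_policy_rollout(start, goal):
    """Stand-in for executing a goal-conditioned policy in the ambient state space.
    Returns (final_state, cost); here the policy is assumed to reach its goal with
    cost equal to the distance |start - goal|."""
    return goal, abs(start - goal)

# Feasibility operator ("jump operator"): summarize each policy's initial-to-final dynamics.
# feas[g][s] = (final_state, cost) of running the policy for sub-goal g from state s.
feas = {g: {s: goal_conditioned_policy_rollout(s, g) for s in range(N_STATES)}
        for g in SUBGOALS}

def respects_order(seq):
    """True if, for every constraint (later, earlier), 'earlier' precedes 'later' in seq."""
    pos = {g: i for i, g in enumerate(seq)}
    return all(pos[earlier] < pos[later] for (later, earlier) in ORDER)

def plan_subgoals(start):
    """Plan a sub-goal sequence using only the feasibility summaries: the planner
    composes initial-to-final 'jumps' and never re-plans in the full state space."""
    best_seq, best_cost = None, float("inf")
    for seq in itertools.permutations(SUBGOALS):
        if not respects_order(seq):
            continue
        state, cost = start, 0.0
        for g in seq:
            state, step_cost = feas[g][state]
            cost += step_cost
        if cost < best_cost:
            best_seq, best_cost = seq, cost
    return best_seq, best_cost

print(plan_subgoals(start=0))

The point the sketch tries to capture is that once the feasibility summaries are computed, planning only composes initial-to-final jumps between sub-goals; the full ambient state space is never searched again, which is what makes the super-exponential space of ordered sub-goal tasks tractable and lets the same policy ensemble be reused across different groundings.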

