Compositionality and Bounds for Optimal Value Functions in Reinforcement Learning

02/19/2023
by Jacob Adamczyk, et al.

An agent's ability to reuse solutions to previously solved problems is critical for learning new tasks efficiently. Recent research using composition of value functions in reinforcement learning has shown that agents can utilize solutions of primitive tasks to obtain solutions for exponentially many new tasks. However, previous work has relied on restrictive assumptions on the dynamics, the method of composition, and the structure of reward functions. Here we consider the case of general composition functions without any restrictions on the structure of reward functions, applicable to both deterministic and stochastic dynamics. For this general setup, we provide bounds on the corresponding optimal value functions and characterize the value of corresponding policies. These theoretical results lead to improvements in training for both entropy-regularized and standard reinforcement learning, which we validate with numerical simulations.

research
12/02/2022

Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning

In reinforcement learning (RL), the ability to utilize prior knowledge f...
research
03/05/2023

Bounding the Optimal Value Function in Compositional Reinforcement Learning

In the field of reinforcement learning (RL), agents are often tasked wit...
research
07/01/2022

Modular Lifelong Reinforcement Learning via Neural Composition

Humans commonly solve complex problems by decomposing them into easier s...
research
03/13/2018

Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies

In this work, we provide theoretical guarantees for reward decomposition...
research
03/14/2022

Orchestrated Value Mapping for Reinforcement Learning

We present a general convergent class of reinforcement learning algorith...
research
07/10/2023

Dynamics of Temporal Difference Reinforcement Learning

Reinforcement learning has been successful across several applications i...
research
06/09/2011

Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks

This paper discusses a system that accelerates reinforcement learning by...
