Lipschitz Lifelong Reinforcement Learning

01/15/2020
by Erwan Lecarpentier et al.

We consider the problem of knowledge transfer when an agent faces a series of Reinforcement Learning (RL) tasks. We introduce a novel metric between Markov Decision Processes (MDPs) and establish that close MDPs have close optimal value functions. Formally, the optimal value functions are Lipschitz continuous with respect to the task space. These theoretical results lead us to a value transfer method for Lifelong RL, which we use to build a PAC-MDP algorithm with an improved convergence rate. We illustrate the benefits of the method in Lifelong RL experiments.
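
To make the Lipschitz claim concrete, here is one plausible formalization, assuming d denotes the proposed metric between two tasks M and M' and L a Lipschitz constant; the paper's exact pseudometric and constants may differ:

\[
\forall (s, a): \quad \bigl| Q^{*}_{M}(s, a) - Q^{*}_{M'}(s, a) \bigr| \;\le\; L \, d(M, M')
\]

A bound of this form suggests the transfer mechanism: optimal values from past tasks, shifted by the task distance, remain valid optimistic upper bounds on the new task's values. The sketch below illustrates that idea in Python; the function name, signature, and data layout are hypothetical, not the paper's API:

from typing import Dict, List, Tuple

State = int
Action = int
QTable = Dict[Tuple[State, Action], float]

def lipschitz_upper_bound(
    past_q: List[QTable],    # optimal Q-tables of previously solved tasks
    task_dist: List[float],  # upper bounds on d(M_i, M_new) for each past task
    lip_const: float,        # Lipschitz constant L from the theory
    v_max: float,            # trivial bound r_max / (1 - gamma)
    s: State,
    a: Action,
) -> float:
    """Optimistic upper bound on Q*(s, a) in the new task.

    Each term q_i(s, a) + L * d_i upper-bounds the new task's optimal
    value by Lipschitz continuity, so their minimum (capped at the
    trivial bound v_max) is a valid, tighter upper bound as well.
    """
    bound = v_max
    for q_i, d_i in zip(past_q, task_dist):
        bound = min(bound, q_i.get((s, a), v_max) + lip_const * d_i)
    return bound

In an R-Max-style PAC-MDP learner, replacing the uninformed v_max initialization with this tighter bound shrinks the set of state-action pairs that must be explored optimistically, which is the intuition behind the improved convergence rate.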

Related research:

08/22/2019
Opponent Aware Reinforcement Learning
In several reinforcement learning (RL) scenarios such as security settin...

06/10/2015
The Online Coupon-Collector Problem and Its Application to Lifelong Reinforcement Learning
Transferring knowledge across a sequence of related tasks is an importan...

09/21/2018
Target Transfer Q-Learning and Its Convergence Analysis
Q-learning is one of the most popular methods in Reinforcement Learning ...

09/05/2018
Reinforcement Learning under Threats
In several reinforcement learning (RL) scenarios, mainly in security set...

02/14/2022
Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods
Value-based methods play a fundamental role in Markov decision processes...

06/29/2023
Eigensubspace of Temporal-Difference Dynamics and How It Improves Value Approximation in Reinforcement Learning
We propose a novel value approximation method, namely Eigensubspace Regu...

02/15/2022
L2C2: Locally Lipschitz Continuous Constraint towards Stable and Smooth Reinforcement Learning
This paper proposes a new regularization technique for reinforcement lea...
