Transferred Q-learning

by Elynn Y. Chen, et al.
Renmin University of China
NYU college
Berkeley college

We consider Q-learning with knowledge transfer, using samples from a target reinforcement learning (RL) task together with source samples from different but related RL tasks. We propose transfer learning algorithms for both batch and online Q-learning with offline source studies. The proposed transferred Q-learning algorithm contains a novel re-targeting step that enables vertical information-cascading along multiple steps in an RL task, in addition to the usual horizontal information-gathering of transfer learning (TL) for supervised learning. We establish the first theoretical justifications of TL in RL tasks by showing a faster rate of convergence of the Q-function estimation in offline RL transfer, and a lower regret bound in offline-to-online RL transfer, under certain similarity assumptions. Empirical evidence from both synthetic and real datasets is presented to support the proposed algorithm and our theoretical results.
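The abstract does not spell out the algorithm, so the following is only a rough illustration of the two ideas it names, not the authors' method. It assumes a two-stage task, binary actions, and a linear Q-function on a hypothetical basis `[1, s, a, s*a]`. The "horizontal" step is sketched as a pooled ridge fit on source plus target data followed by a shrunken correction on target data alone; the "vertical" step is sketched as backward induction in which the stage-1 pseudo-responses of both datasets are built from the target's stage-2 estimate. All function names, the ridge penalties, and the data layout are assumptions made for this sketch.

```python
import numpy as np

def features(s, a):
    """Hypothetical linear basis for Q(s, a): [1, s, a, s*a]."""
    return np.column_stack([np.ones_like(s), s, a, s * a])

def ridge(X, y, lam):
    """Closed-form ridge regression."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def transfer_fit(X_tgt, y_tgt, X_src, y_src, lam_pool=1e-3, lam_corr=20.0):
    """Horizontal step: pool source and target, then correct on target only."""
    X_all = np.vstack([X_tgt, X_src])
    y_all = np.concatenate([y_tgt, y_src])
    beta_pool = ridge(X_all, y_all, lam_pool)
    # Shrunken correction for the source/target discrepancy (assumed small).
    delta = ridge(X_tgt, y_tgt - X_tgt @ beta_pool, lam_corr)
    return beta_pool + delta

def max_q(beta, s):
    """Greedy value: max over a in {0, 1} of the fitted linear Q-function."""
    q0 = features(s, np.zeros_like(s)) @ beta
    q1 = features(s, np.ones_like(s)) @ beta
    return np.maximum(q0, q1)

def transferred_q(tgt, src):
    """Backward induction over two stages with transfer at each stage."""
    # Stage 2: the response is the observed final reward.
    beta2 = transfer_fit(features(tgt["s2"], tgt["a2"]), tgt["r2"],
                         features(src["s2"], src["a2"]), src["r2"])
    # Vertical step (assumed form): stage-1 pseudo-responses for BOTH
    # datasets are built from the target's stage-2 estimate.
    y1_tgt = tgt["r1"] + max_q(beta2, tgt["s2"])
    y1_src = src["r1"] + max_q(beta2, src["s2"])
    beta1 = transfer_fit(features(tgt["s1"], tgt["a1"]), y1_tgt,
                         features(src["s1"], src["a1"]), y1_src)
    return beta1, beta2

def simulate(n, beta1, beta2, rng, noise=0.5):
    """Synthetic two-stage trajectories with random binary actions."""
    s1 = rng.normal(size=n)
    a1 = rng.integers(0, 2, size=n).astype(float)
    r1 = features(s1, a1) @ beta1 + noise * rng.normal(size=n)
    s2 = 0.5 * s1 + rng.normal(size=n)
    a2 = rng.integers(0, 2, size=n).astype(float)
    r2 = features(s2, a2) @ beta2 + noise * rng.normal(size=n)
    return {"s1": s1, "a1": a1, "r1": r1, "s2": s2, "a2": a2, "r2": r2}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    b1 = np.array([0.0, 0.5, 0.3, -0.2])
    b2 = np.array([0.5, 1.0, -0.5, 0.8])
    tgt = simulate(200, b1, b2, rng)             # small target sample
    src = simulate(2000, b1 + 0.05, b2 + 0.05, rng)  # large, slightly shifted source
    beta1, beta2 = transferred_q(tgt, src)
    print("stage-2 coefficients:", np.round(beta2, 3))
```

In this toy setup the source coefficients differ from the target's by a small constant shift, which stands in for the "similarity assumptions" mentioned in the abstract; the correction penalty `lam_corr` controls how strongly the target data may move the pooled estimate.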

Target Transfer Q-Learning and Its Convergence Analysis

Q-learning is one of the most popular methods in Reinforcement Learning ...

Importance Weighted Transfer of Samples in Reinforcement Learning

We consider the transfer of experience samples (i.e., tuples < s, a, s',...

A Taxonomy of Similarity Metrics for Markov Decision Processes

Although the notion of task similarity is potentially interesting in a w...

Transfer of Temporal Logic Formulas in Reinforcement Learning

Transferring high-level knowledge from a source task to a target task is...

Robust Knowledge Transfer in Tiered Reinforcement Learning

In this paper, we study the Tiered Reinforcement Learning setting, a par...

Accumulating Knowledge for Lifelong Online Learning

Lifelong learning can be viewed as a continuous transfer learning proced...

Transfer Learning for Operator Selection: A Reinforcement Learning Approach

In the past two decades, metaheuristic optimization algorithms (MOAs) ha...

Code Repositories