Efficient Deep Reinforcement Learning through Policy Transfer

02/19/2020
by Tianpei Yang, et al.

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from previously learned policies of relevant tasks. Existing transfer approaches either explicitly compute the similarity between tasks or select appropriate source policies to provide guided exploration for the target task. However, a method that directly optimizes the target policy by alternately drawing on knowledge from appropriate source policies, without explicitly measuring task similarity, is still missing. In this paper, we propose a novel Policy Transfer Framework (PTF) that accelerates RL by taking advantage of this idea. Our framework models multi-policy transfer as an option learning problem, and in doing so learns when and which source policy is best to reuse for the target policy and when to terminate that reuse. PTF can be easily combined with existing deep RL approaches. Experimental results show that it significantly accelerates the learning process and surpasses state-of-the-art policy transfer methods in terms of learning efficiency and final performance in both discrete and continuous action spaces.
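
To make the option-learning view concrete, here is a minimal tabular sketch of treating each source policy as an option: one is picked by its estimated value, followed until a termination draw fires, and then a new one is picked. This is an illustration under stated assumptions, not the authors' implementation. The names `select_option` and `rollout_advice`, the epsilon-greedy rule, and the fixed per-option termination probabilities are all hypothetical. The abstract does not specify how option values or terminations are learned, so here they are simply given as inputs.

```python
import random


def select_option(values, epsilon, rng):
    """Epsilon-greedy choice among source policies (hypothetical selection rule)."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda i: values[i])


def rollout_advice(source_policies, option_values, termination_probs,
                   states, epsilon=0.1, seed=0):
    """Walk through `states` and record which source policy advises each step.

    option_values: dict mapping state -> list of value estimates, one per option
                   (assumed to be given; learning them is out of scope here).
    termination_probs: list with one termination probability per option
                       (assumed state-independent for simplicity).
    Returns a list of (state, option_index, advised_action) tuples.
    """
    rng = random.Random(seed)
    current = None
    trace = []
    for s in states:
        # Re-select when no option is active or the active one terminates.
        if current is None or rng.random() < termination_probs[current]:
            current = select_option(option_values[s], epsilon, rng)
        trace.append((s, current, source_policies[current](s)))
    return trace
```

For example, with two source policies that always advise `"left"` and `"right"`, a termination probability of 1.0 re-selects the highest-valued policy at every state, while 0.0 keeps following the first choice for the whole sequence. In a full system the advised action would guide the target policy's update rather than being executed directly; that part is omitted here.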


