ExTra: Transfer-guided Exploration

06/27/2019
by Anirban Santara, et al.

In this work we present a novel approach to transfer-guided exploration in reinforcement learning, inspired by the human tendency to draw on experience from similar past encounters when navigating a new task. Given an optimal policy in a related task-environment, we show that its bisimulation distance from the current task-environment gives a lower bound on the optimal advantage of state-action pairs in the current task-environment. Transfer-guided Exploration (ExTra) samples actions from a softmax distribution over these lower bounds, so that actions with potentially higher optimal advantage are sampled more frequently. In our experiments on gridworld environments, we demonstrate that, given access to an optimal policy in a related task-environment, ExTra can outperform popular domain-specific exploration strategies, namely epsilon-greedy, Model-Based Interval Estimation - Exploration Based (MBIE-EB), Pursuit, and Boltzmann, in terms of sample complexity and rate of convergence. We further show that ExTra is robust to the choice of source task and degrades gracefully as the dissimilarity of the source task increases. We also demonstrate that ExTra, when used alongside traditional exploration algorithms, improves their rate of convergence, and is thus capable of complementing them.
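
As a rough illustration of the sampling step described in the abstract, the sketch below draws an action from a softmax distribution over per-action lower bounds on the optimal advantage. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `softmax` and `sample_action`, the `temperature` parameter, and the example bound values are all hypothetical, and the computation of the bounds from bisimulation distances is not shown.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Return softmax probabilities for a list of real values.

    `temperature` is an assumed knob (not taken from the abstract):
    larger values flatten the distribution, smaller values sharpen it.
    """
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(lower_bounds, rng, temperature=1.0):
    """Sample an action index with probability softmax(lower_bounds)."""
    probs = softmax(lower_bounds, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical lower bounds on the optimal advantage of four actions
# in a single state (illustrative numbers only).
example_bounds = [-2.0, -0.5, -1.0, -3.0]

rng = random.Random(0)
counts = [0] * len(example_bounds)
for _ in range(10_000):
    counts[sample_action(example_bounds, rng)] += 1

print("probabilities:", softmax(example_bounds))
print("sample counts:", counts)
```

In this sketch the action with the largest lower bound (index 1) is drawn most often, while every other action keeps a nonzero probability, which is what lets the sampler continue to explore.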


research
10/11/2022

The Role of Exploration for Task Transfer in Reinforcement Learning

The exploration–exploitation trade-off in reinforcement learning (RL) is...
research
02/10/2021

Task-Optimal Exploration in Linear Dynamical Systems

Exploration in unknown environments is a fundamental problem in reinforc...
research
11/28/2022

Inapplicable Actions Learning for Knowledge Transfer in Reinforcement Learning

Reinforcement Learning (RL) algorithms are known to scale poorly to envi...
research
06/20/2022

Sampling Efficient Deep Reinforcement Learning through Preference-Guided Stochastic Exploration

Massive practical works addressed by Deep Q-network (DQN) algorithm have...
research
05/29/2022

Provable Benefits of Representational Transfer in Reinforcement Learning

We study the problem of representational transfer in RL, where an agent ...
research
07/10/2019

An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies

What is a good exploration strategy for an agent that interacts with an ...
research
05/20/2018

Human-guided data exploration using randomisation

An explorative data analysis system should be aware of what the user alr...
