Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals

09/25/2018
by Vikas Dhiman, et al.

Consider multi-goal tasks that involve static environments and dynamic goals. Such tasks abound in robotics; goal-directed navigation and pick-and-place are two examples. Two types of Reinforcement Learning (RL) algorithms are used for such tasks: model-free and model-based. Each approach has limitations. Model-free RL achieves high asymptotic accuracy in single-goal tasks, but struggles to transfer learned information when the goal location changes. Model-based RL can transfer learned information to new goal locations by retaining the explicitly learned state dynamics, but small errors in modelling these dynamics accumulate over long-term planning. In this work, we address the limitations of model-free RL in multi-goal domains. We do this by adapting the Floyd-Warshall algorithm for RL and call the adaptation Floyd-Warshall RL (FWRL). The proposed algorithm learns a goal-conditioned action-value function by constraining the value of the optimal path between any two states to be greater than or equal to the value of paths via intermediary states. Experimentally, we show that FWRL is more sample-efficient and learns higher-reward strategies in multi-goal tasks than Q-learning, model-based RL and other relevant baselines in a tabular domain.
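The path constraint described in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation; it is a minimal example under stated assumptions: a table `F[s, a, g]` holds the best known cumulative reward for taking action `a` in state `s` and eventually reaching state `g`, unseen entries start at negative infinity, step rewards are negative, and the environment is a hypothetical deterministic 4-state chain. The names `record_transition` and `fw_relax` are introduced here for illustration only.

```python
import numpy as np

N_STATES, N_ACTIONS = 4, 2   # hypothetical chain 0-1-2-3; action 0 = left, 1 = right


def record_transition(F, s, a, r, s_next):
    """Store a one-step experience: taking a in s reached s_next with reward r."""
    F[s, a, s_next] = max(F[s, a, s_next], r)


def fw_relax(F):
    """One Floyd-Warshall-style sweep over intermediary states k.

    Enforces F[s, a, g] >= F[s, a, k] + max_a' F[k, a', g], i.e. the value of
    the best path from s to g is at least the value of any path through k.
    Assumes negative step rewards, so no positive-reward cycles exist.
    """
    for k in range(F.shape[0]):
        best_from_k = F[k].max(axis=0)                        # shape (S,), indexed by goal g
        via_k = F[:, :, k, None] + best_from_k[None, None, :]  # shape (S, A, S)
        np.maximum(F, via_k, out=F)
    return F


# Unseen (state, action, goal) triples start at -inf: "no known path".
F = np.full((N_STATES, N_ACTIONS, N_STATES), -np.inf)

# Assumed past experience: the agent only ever walked right, one step at a time.
for s in range(N_STATES - 1):
    record_transition(F, s, a=1, r=-1.0, s_next=s + 1)

fw_relax(F)

# State 3 was never given as a goal from state 0, yet a value is now available.
value_0_to_3 = F[0, 1, 3]
greedy_action = int(np.argmax(F[0, :, 3]))
```

In this sketch, one-step experiences are stitched into multi-step values for goals that were never explicitly targeted, which is the kind of transfer to new goals that the abstract describes. A practical agent would interleave such relaxation with exploration and would need to handle stochastic transitions, which this example does not attempt.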


