Heuristic-Guided Reinforcement Learning

06/05/2021
by Ching-An Cheng, et al.

We provide a framework for accelerating reinforcement learning (RL) algorithms by heuristics constructed from domain knowledge or offline data. Tabula rasa RL algorithms require environment interactions or computation that scales with the horizon of the sequential decision-making task. Using our framework, we show how heuristic-guided RL induces a much shorter-horizon subproblem that provably solves the original task. Our framework can be viewed as a horizon-based regularization for controlling bias and variance in RL under a finite interaction budget. On the theoretical side, we characterize properties of a good heuristic and its impact on RL acceleration. In particular, we introduce the novel concept of an "improvable heuristic" – a heuristic that allows an RL agent to extrapolate beyond its prior knowledge. On the empirical side, we instantiate our framework to accelerate several state-of-the-art algorithms in simulated robotic control tasks and procedurally generated games. Our framework complements the rich literature on warm-starting RL with expert demonstrations or exploratory datasets, and introduces a principled method for injecting prior knowledge into RL.
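The abstract does not spell out how the shorter-horizon subproblem is built. As a minimal sketch, assume the guidance works by blending a heuristic value estimate of the next state into the reward while shrinking the discount factor by a mixing coefficient. The names `reshape_transition`, `effective_horizon`, and `lam` are hypothetical and chosen only for illustration; the full text is the authority on the actual construction.

```python
def effective_horizon(gamma: float) -> float:
    """Rough planning horizon implied by a discount factor: 1 / (1 - gamma)."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must be in [0, 1)")
    return 1.0 / (1.0 - gamma)


def reshape_transition(reward: float, next_state_heuristic: float,
                       gamma: float, lam: float) -> tuple[float, float]:
    """Return (shaped_reward, shaped_gamma) for one transition.

    Assumed form (an illustration, not quoted from the abstract):
        shaped_reward = reward + (1 - lam) * gamma * h(next_state)
        shaped_gamma  = lam * gamma
    lam = 1 recovers the original problem; lam = 0 gives a one-step
    problem that trusts the heuristic completely.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    shaped_reward = reward + (1.0 - lam) * gamma * next_state_heuristic
    shaped_gamma = lam * gamma
    return shaped_reward, shaped_gamma


if __name__ == "__main__":
    gamma, lam = 0.99, 0.5
    shaped_reward, shaped_gamma = reshape_transition(
        reward=1.0, next_state_heuristic=10.0, gamma=gamma, lam=lam)
    print("shaped reward:", shaped_reward)
    print("original horizon:", effective_horizon(gamma))
    print("shortened horizon:", effective_horizon(shaped_gamma))
```

Under this assumed form, a smaller `lam` shortens the horizon the base RL algorithm must handle but leans harder on the heuristic, which matches the bias-variance trade-off the abstract describes.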


research
08/12/2022

RLang: A Declarative Language for Expressing Prior Knowledge for Reinforcement Learning

Communicating useful background knowledge to reinforcement learning (RL)...
research
06/16/2019

Reinforcement Learning Driven Heuristic Optimization

Heuristic algorithms such as simulated annealing, Concorde, and METIS ar...
research
06/17/2023

Vanishing Bias Heuristic-guided Reinforcement Learning Algorithm

Reinforcement Learning has achieved tremendous success in the many Atari...
research
11/22/2021

Bridging the gap between learning and heuristic based pushing policies

Non-prehensile pushing actions have the potential to singulate a target ...
research
09/08/2022

FORLORN: A Framework for Comparing Offline Methods and Reinforcement Learning for Optimization of RAN Parameters

The growing complexity and capacity demands for mobile networks necessit...
research
05/27/2022

Provably Sample-Efficient RL with Side Information about Latent Dynamics

We study reinforcement learning (RL) in settings where observations are ...
research
05/25/2023

Reward-Machine-Guided, Self-Paced Reinforcement Learning

Self-paced reinforcement learning (RL) aims to improve the data efficien...
