Frequency-based Search-control in Dyna

02/14/2020
by Yangchen Pan, et al.

Model-based reinforcement learning has been shown empirically to improve sample efficiency. In particular, Dyna is an elegant model-based architecture that integrates learning and planning and offers great flexibility in how a model is used. One of the most important components in Dyna is search-control, the process of generating states or state-action pairs from which we query the model to acquire simulated experiences. Search-control is critical to improving learning efficiency. In this work, we propose a simple and novel search-control strategy that searches high-frequency regions of the value function. Our main intuition builds on the Shannon sampling theorem from signal processing, which indicates that a high-frequency signal requires more samples to reconstruct. We show empirically that a high-frequency function is more difficult to approximate. This suggests a search-control strategy: use states from high-frequency regions of the value function to query the model, so that more samples are acquired there. We develop a simple strategy to measure the frequency of a function locally by its gradient and Hessian norms, and provide theoretical justification for this approach. We then apply our strategy to search-control in Dyna and conduct experiments on benchmark domains to show its properties and effectiveness.
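To make the idea of a local frequency measure concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a one-dimensional function, approximates the derivatives with central finite differences, and combines them as the sum of the squared first and second derivatives. That combination, the step size `h`, the test function `piecewise_sine`, and the helper names `local_frequency_score` and `rank_states` are all illustrative assumptions, not details taken from the abstract.

```python
import math


def local_frequency_score(f, x, h=1e-3):
    """Hypothetical local frequency proxy for a 1-D function.

    Assumption: the score is (f')^2 + (f'')^2, with both derivatives
    estimated by central finite differences of step h.
    """
    d1 = (f(x + h) - f(x - h)) / (2 * h)            # first derivative (1-D gradient)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)  # second derivative (1-D Hessian)
    return d1 * d1 + d2 * d2


def piecewise_sine(x):
    """Illustrative stand-in for a value function:
    fast oscillation for x < 0, slow oscillation for x >= 0."""
    if x < 0:
        return math.sin(8 * math.pi * x)
    return math.sin(math.pi * x)


def rank_states(f, candidates, k):
    """Return the k candidate states with the largest score."""
    return sorted(
        candidates,
        key=lambda x: local_frequency_score(f, x),
        reverse=True,
    )[:k]


# Candidate states on a grid over [-1.9, 1.9]; none lies within h of x = 0.
candidates = [-1.9 + 0.2 * i for i in range(20)]
top = rank_states(piecewise_sine, candidates, k=5)
```

Under these assumptions, every state in `top` falls in the fast-oscillating half of the domain (x < 0). In a Dyna-style agent, states selected this way would be the ones used to query the model for simulated experience. For a learned value function, the finite differences would presumably be replaced by automatic differentiation.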
