Hamiltonian Q-Learning: Leveraging Importance-sampling for Data Efficient RL

11/11/2020
by Udari Madhushani, et al.

Model-free reinforcement learning (RL), in particular Q-learning, is widely used to learn optimal policies for a variety of planning and control problems. However, when the underlying state-transition dynamics are stochastic and high-dimensional, Q-learning requires a large amount of data and incurs a prohibitively high computational cost. In this paper, we introduce Hamiltonian Q-Learning, a data-efficient modification of the Q-learning approach that adopts an importance-sampling-based technique for computing the Q function. To exploit the stochastic structure of the state-transition dynamics, we employ Hamiltonian Monte Carlo to update Q function estimates, approximating the expected future rewards using Q values associated with a subset of next states. Further, to exploit the latent low-rank structure of the dynamic system, Hamiltonian Q-Learning uses a matrix completion algorithm to reconstruct the updated Q function from Q value updates over a much smaller subset of state-action pairs. By providing an efficient way to apply Q-learning in stochastic, high-dimensional problems, the proposed approach broadens the scope of RL algorithms for real-world applications, including classical control tasks and environmental monitoring.
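
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: a one-dimensional Gaussian next-state distribution, a toy Q function `q_toy`, hand-picked leapfrog settings, and a simple truncated-SVD iteration (`complete_low_rank`) standing in for the matrix completion algorithm, which the abstract does not specify. The first part draws next states with a basic Hamiltonian Monte Carlo sampler and averages the maximum Q value over them to form a Q-learning target; the second part fills in a low-rank table from a subset of its entries.

```python
import numpy as np

def hmc_samples(u, grad_u, x0, n_samples, rng, step=0.1, n_leapfrog=8):
    """Draw (correlated) 1-D samples from the density proportional to exp(-u(x))."""
    x = float(x0)
    out = np.empty(n_samples)
    for i in range(n_samples):
        p = rng.normal()                             # resample momentum
        x_new = x
        p_new = p - 0.5 * step * grad_u(x_new)       # half step in momentum
        for j in range(n_leapfrog):
            x_new = x_new + step * p_new             # full step in position
            if j < n_leapfrog - 1:
                p_new = p_new - step * grad_u(x_new) # full step in momentum
        p_new = p_new - 0.5 * step * grad_u(x_new)   # final half step
        # Metropolis correction based on the change in total energy.
        log_accept = (u(x) + 0.5 * p**2) - (u(x_new) + 0.5 * p_new**2)
        if np.log(rng.random()) < log_accept:
            x = x_new
        out[i] = x
    return out

def complete_low_rank(observed, mask, rank=1, n_iters=200):
    """Fill unobserved entries by repeated rank-truncated SVD (illustrative stand-in)."""
    x = np.where(mask, observed, 0.0)
    for _ in range(n_iters):
        u_mat, s, vt = np.linalg.svd(x, full_matrices=False)
        low_rank = (u_mat[:, :rank] * s[:rank]) @ vt[:rank]
        x = np.where(mask, observed, low_rank)       # keep observed entries fixed
    return low_rank

rng = np.random.default_rng(0)

# Part 1: sampled Bellman target for one (state, action) pair.
# Assumed toy setup: next state ~ N(mu, sigma^2), three discrete actions.
mu, sigma = 0.3, 0.5
actions = np.array([-1.0, 0.0, 1.0])

def q_toy(s, a):
    return -(s - a) ** 2                             # hypothetical Q function

def potential(x):
    return 0.5 * ((x - mu) / sigma) ** 2             # negative log-density (up to a constant)

def grad_potential(x):
    return (x - mu) / sigma**2

next_states = hmc_samples(potential, grad_potential, x0=mu, n_samples=2000, rng=rng)
max_q = q_toy(next_states[:, None], actions[None, :]).max(axis=1)
reward, gamma = 1.0, 0.9
target = reward + gamma * max_q.mean()               # sampled estimate of r + gamma * E[max_a Q(s', a)]

# Part 2: rebuild a rank-1 "Q table" from roughly 60% of its entries.
q_true = np.outer(np.linspace(1.0, 2.0, 20), np.linspace(1.0, 3.0, 15))
mask = rng.random(q_true.shape) < 0.6
q_hat = complete_low_rank(q_true, mask)
rel_err = np.linalg.norm(q_hat - q_true) / np.linalg.norm(q_true)

print(f"sampled target: {target:.3f}, completion relative error: {rel_err:.2e}")
```

In this sketch the sampler only needs the potential and its gradient, so a different transition model could be swapped in by changing those two functions. The completion step assumes the table really is close to low rank and that every row and column has at least one observed entry.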


