Q-learning with Nearest Neighbors

02/12/2018
by Devavrat Shah, et al.

We consider the problem of model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernels, when only a single sample path of the system is available. We focus on the classical approach of Q-learning, where the goal is to learn the optimal Q-function. We propose the Nearest Neighbor Q-Learning approach, which uses nearest neighbor regression to learn the Q-function. We provide a finite-sample analysis of the convergence rate of this method. In particular, we establish that the algorithm is guaranteed to output an ϵ-accurate estimate of the optimal Q-function with high probability using a number of observations that depends polynomially on ϵ and the model parameters. To establish our results, we develop a robust version of stochastic approximation results; this may be of interest in its own right.
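
To make the general idea concrete, the following is a minimal sketch of Q-learning combined with 1-nearest-neighbor lookup along a single sample path. It is not the algorithm analyzed in the paper, whose exact update schedule and step sizes are not given in this abstract. Everything here is an assumption chosen for illustration: the toy one-dimensional dynamics in `step`, the reward, the uniform grid of centers, the uniformly random exploration policy, the constant step size `alpha`, and the function names themselves.

```python
import random


def nearest(centers, s):
    """Index of the center closest to state s (1-nearest-neighbor lookup)."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - s))


def step(s, a, rng):
    """Hypothetical toy dynamics on [0, 1]: action 1 drifts right, action 0 drifts left."""
    drift = 0.1 if a == 1 else -0.1
    s_next = min(1.0, max(0.0, s + drift + rng.uniform(-0.02, 0.02)))
    reward = s_next  # assumed reward: larger toward the right edge
    return s_next, reward


def nn_q_learning(num_centers=11, steps=20000, gamma=0.9, alpha=0.1, seed=0):
    """Learn Q-values at a finite set of centers from one continuous-state trajectory."""
    rng = random.Random(seed)
    centers = [i / (num_centers - 1) for i in range(num_centers)]
    q = [[0.0, 0.0] for _ in centers]  # q[center_index][action]
    s = rng.random()
    for _ in range(steps):
        a = rng.randrange(2)  # uniformly random exploration
        s_next, r = step(s, a, rng)
        i = nearest(centers, s)       # center representing the current state
        j = nearest(centers, s_next)  # center representing the next state
        target = r + gamma * max(q[j])
        q[i][a] += alpha * (target - q[i][a])
        s = s_next  # continue along the same sample path, no resets
    return centers, q


def q_value(centers, q, s, a):
    """Estimate Q(s, a) at an arbitrary state via its nearest center."""
    return q[nearest(centers, s)][a]


if __name__ == "__main__":
    centers, q = nn_q_learning()
    for s in (0.1, 0.5, 0.9):
        print(s, q_value(centers, q, s, 0), q_value(centers, q, s, 1))
```

In this toy setting the learned estimate should prefer the rightward action, since it yields both a higher immediate reward and a better next state. A k-nearest-neighbor or kernel-weighted average over several centers could replace the single-neighbor lookup in `nearest` without changing the overall structure.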


research
08/03/2023

Minimax Optimal Q Learning with Nearest Neighbors

Q learning is a popular model free reinforcement learning method. Most o...
research
02/06/2019

Finite-Sample Analysis for SARSA and Q-Learning with Linear Function Approximation

Though the convergence of major reinforcement learning algorithms has be...
research
06/05/2016

Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators

We provide finite-sample analysis of a general framework for using k-nea...
research
04/26/2019

From Predictions to Prescriptions in Multistage Optimization Problems

In this paper, we introduce a framework for solving finite-horizon multi...
research
10/17/2014

A Hierarchical Multi-Output Nearest Neighbor Model for Multi-Output Dependence Learning

Multi-Output Dependence (MOD) learning is a generalization of standard c...
research
01/13/2022

Certifiable Robustness for Nearest Neighbor Classifiers

ML models are typically trained using large datasets of high quality. Ho...
research
02/14/2019

On Reinforcement Learning Using Monte Carlo Tree Search with Supervised Learning: Non-Asymptotic Analysis

Inspired by the success of AlphaGo Zero (AGZ) which utilizes Monte Carlo...
