Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

12/23/2021
by   Wenlong Mou, et al.

We study stochastic approximation procedures for approximately solving a d-dimensional linear fixed point equation based on observing a trajectory of length n from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order t_mix · d/n on the squared error of the last iterate of a standard scheme, where t_mix is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters (d, t_mix) in the higher order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise – covering the TD(λ) family of algorithms for all λ ∈ [0, 1) – and linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of λ when running the TD(λ) algorithm).
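
As a rough illustration of the setting, the sketch below runs tabular TD(0) (the λ = 0 member of the TD(λ) family) along a single trajectory of a small Markov chain and returns both the last iterate and a running average of the iterates. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-state chain, the rewards, the constant stepsize `eta`, the burn-in length, and the helper names `averaged_td0` and `exact_values_2state`. The paper's own stepsize and averaging choices may differ.

```python
import random


def averaged_td0(P, r, gamma, n, eta, burn_in, seed=0):
    """Tabular TD(0) on one trajectory of length n.

    Returns (last_iterate, averaged_iterate). The average is taken over
    iterates after `burn_in` steps (an illustrative choice).
    """
    rng = random.Random(seed)
    d = len(r)
    theta = [0.0] * d   # current value estimate
    avg = [0.0] * d     # running average of post-burn-in iterates
    count = 0
    s = 0               # arbitrary initial state
    for t in range(n):
        # Markovian noise: the next state is drawn from row P[s].
        s_next = rng.choices(range(d), weights=P[s])[0]
        td_error = r[s] + gamma * theta[s_next] - theta[s]
        theta[s] += eta * td_error
        if t >= burn_in:
            count += 1
            for i in range(d):
                avg[i] += (theta[i] - avg[i]) / count
        s = s_next
    return theta, avg


def exact_values_2state(P, r, gamma):
    """Solve the fixed point v = r + gamma * P v for a two-state chain."""
    a, b = 1 - gamma * P[0][0], -gamma * P[0][1]
    c, e = -gamma * P[1][0], 1 - gamma * P[1][1]
    det = a * e - b * c
    return [(e * r[0] - b * r[1]) / det, (-c * r[0] + a * r[1]) / det]


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical transition matrix
    r = [1.0, 0.0]                 # hypothetical rewards
    gamma = 0.5
    v = exact_values_2state(P, r, gamma)
    last, avg = averaged_td0(P, r, gamma, n=20000, eta=0.05, burn_in=2000)
    err = lambda x: sum((xi - vi) ** 2 for xi, vi in zip(x, v))
    print("squared error, last iterate:    ", err(last))
    print("squared error, averaged iterate:", err(avg))
```

Comparing the two printed errors across seeds or trajectory lengths gives an informal feel for the gap between last-iterate and averaged estimates that the abstract describes; it is not a substitute for the paper's bounds.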


