Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms

06/16/2020
by   Guy Bresler, et al.

We study least squares linear regression when the data points are dependent and sampled from a Markov chain. We establish sharp information-theoretic minimax lower bounds for this problem in terms of τ_mix, the mixing time of the underlying Markov chain, under different noise settings. Our results show that, in general, optimization with Markovian data is strictly harder than optimization with independent data, and that a trivial algorithm (SGD-DD), which uses only one in every Θ̃(τ_mix) samples so that the retained samples are approximately independent, is minimax optimal. In fact, SGD-DD is strictly better than the popular Stochastic Gradient Descent (SGD) method with constant step size, which is otherwise minimax optimal in the independent-data regression setting. Beyond this worst-case analysis, we investigate whether structured datasets seen in practice, such as Gaussian auto-regressive dynamics, admit more efficient optimization schemes. Surprisingly, even in this specific and natural setting, SGD with constant step size is still no better than SGD-DD. Instead, we propose an algorithm based on experience replay, a popular reinforcement learning technique, that achieves a significantly better error rate. Our improved rate is one of the first results in which an algorithm outperforms SGD-DD on an interesting Markov chain, and it also provides one of the first theoretical analyses supporting the use of experience replay in practice.
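The abstract contrasts three update schemes on a correlated data stream. The sketch below is a minimal illustration under stated assumptions, not the paper's exact algorithm: it compares constant-step-size SGD, SGD-DD (keeping one sample in every τ_mix), and a generic experience-replay variant on synthetic Gaussian auto-regressive data. The function names, the AR(1) construction, and all parameter values (rho, tau_mix, buffer_size, step) are illustrative assumptions, and the paper's replay-based method may sample its buffer differently.

```python
import numpy as np

def generate_ar_data(n, d, rho=0.95, noise_std=0.1, seed=0):
    """Hypothetical Gaussian AR(1) covariates: x_{t+1} = rho*x_t + sqrt(1-rho^2)*g_t."""
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=d) / np.sqrt(d)          # ground-truth regressor
    X = np.zeros((n, d))
    X[0] = rng.normal(size=d)
    for t in range(1, n):
        X[t] = rho * X[t - 1] + np.sqrt(1.0 - rho**2) * rng.normal(size=d)
    y = X @ w_star + noise_std * rng.normal(size=n)   # noisy linear responses
    return X, y, w_star

def sgd(X, y, step=0.02):
    """Constant-step-size SGD on the raw (temporally correlated) stream."""
    w = np.zeros(X.shape[1])
    for x_t, y_t in zip(X, y):
        w -= step * (x_t @ w - y_t) * x_t             # gradient of 0.5*(x.w - y)^2
    return w

def sgd_dd(X, y, tau_mix, step=0.02):
    """SGD-DD: drop data, keeping one sample per tau_mix steps so the
    retained samples are approximately independent."""
    return sgd(X[::tau_mix], y[::tau_mix], step)

def sgd_replay(X, y, buffer_size=50, step=0.02, seed=0):
    """Generic experience-replay sketch: update on a point drawn uniformly
    from a sliding buffer, which breaks up temporal correlations."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(len(y)):
        lo = max(0, t - buffer_size + 1)              # sliding window [lo, t]
        i = rng.integers(lo, t + 1)                   # uniform draw from the buffer
        w -= step * (X[i] @ w - y[i]) * X[i]
    return w

X, y, w_star = generate_ar_data(n=20000, d=10)
for name, w_hat in [("SGD", sgd(X, y)),
                    ("SGD-DD", sgd_dd(X, y, tau_mix=40)),
                    ("Replay", sgd_replay(X, y))]:
    print(f"{name:7s} ||w - w*|| = {np.linalg.norm(w_hat - w_star):.4f}")
```

Uniform sampling from a short sliding buffer only partially decorrelates consecutive updates; the sketch is meant to make the three baselines concrete, not to reproduce the paper's error-rate guarantees.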


