Learning from many trajectories

by Stephen Tu, et al.

We initiate a study of supervised learning from many independent sequences ("trajectories") of non-independent covariates, reflecting tasks in sequence modeling, control, and reinforcement learning. Conceptually, our multi-trajectory setup sits between two traditional settings in statistical learning theory: learning from independent examples and learning from a single auto-correlated sequence. Our conditions for efficient learning generalize the former setting: trajectories must be non-degenerate in ways that extend standard requirements for independent examples. They do not require that trajectories be ergodic, long, or strictly stable. For linear least-squares regression, given n-dimensional examples produced by m trajectories, each of length T, we observe a notable change in statistical efficiency as the number of trajectories increases from a few (namely m ≲ n) to many (namely m ≳ n). Specifically, we establish that the worst-case error rate of this problem is Θ(n / (m T)) whenever m ≳ n. Meanwhile, when m ≲ n, we establish a (sharp) lower bound of Ω(n^2 / (m^2 T)) on the worst-case error rate, realized by a simple, marginally unstable linear dynamical system. A key upshot is that, in domains where trajectories regularly reset, the error rate eventually behaves as if all of the examples were independent altogether, drawn from their marginals. As a corollary of our analysis, we also improve guarantees for the linear system identification problem.
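
To make the setup concrete, the following is a minimal sketch of pooled least-squares regression over m trajectories of length T with n-dimensional covariates. It is an illustration only, not the authors' code: the function names, the choice of the identity matrix as a marginally unstable dynamics matrix, the noise scales, and the problem sizes are all assumptions made for this example.

```python
import numpy as np

def simulate_trajectories(A, m, T, rng):
    """Roll out m independent trajectories x_{t+1} = A x_t + w_t (assumed model)."""
    n = A.shape[0]
    X = np.zeros((m, T, n))
    x = rng.standard_normal((m, n))  # independent initial states, one per trajectory
    for t in range(T):
        X[:, t, :] = x
        x = x @ A.T + rng.standard_normal((m, n))  # unit-variance process noise
    return X

def pooled_least_squares(X, Y):
    """Ordinary least squares over all m*T examples, ignoring trajectory boundaries."""
    n = X.shape[-1]
    X_flat = X.reshape(-1, n)
    Y_flat = Y.reshape(-1, Y.shape[-1])
    W_hat_T, *_ = np.linalg.lstsq(X_flat, Y_flat, rcond=None)
    return W_hat_T.T

rng = np.random.default_rng(0)
n, p, m, T = 4, 2, 50, 20          # hypothetical sizes, here with m > n
A = np.eye(n)                      # assumed marginally unstable dynamics
W_true = rng.standard_normal((p, n))

X = simulate_trajectories(A, m, T, rng)
Y = X @ W_true.T + 0.1 * rng.standard_normal((m, T, p))  # noisy linear responses
W_hat = pooled_least_squares(X, Y)

param_error = np.linalg.norm(W_hat - W_true)
```

Varying m and T in this sketch (for example, holding m T fixed while moving m from below n to above n) is one way to explore the few-versus-many trajectory regimes described above, though the parameter error computed here is not necessarily the error measure used in the analysis.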



