Provable Regret Bounds for Deep Online Learning and Control

10/15/2021
by   Xinyi Chen, et al.

The use of deep neural networks has been highly successful in reinforcement learning and control, although few theoretical guarantees for deep learning exist for these problems. There are two main challenges for deriving performance guarantees: a) control has state information and thus is inherently online and b) deep networks are non-convex predictors for which online learning cannot provide provable guarantees in general. Building on the linearization technique for overparameterized neural networks, we derive provable regret bounds for efficient online learning with deep neural networks. Specifically, we show that over any sequence of convex loss functions, any low-regret algorithm can be adapted to optimize the parameters of a neural network such that it competes with the best net in hindsight. As an application of these results in the online setting, we obtain provable bounds for online episodic control with deep neural network controllers.
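To make the online protocol in the abstract concrete, here is a minimal sketch, not the paper's algorithm. It assumes a hypothetical two-layer ReLU network, squared loss as the convex loss, plain online gradient descent as the low-regret algorithm, and a synthetic data stream. The names `init_network`, `predict`, `ogd_step`, and `run_online` are illustrative and do not come from the paper.

```python
import math
import random

def init_network(rng, d_in, width):
    # Hypothetical two-layer ReLU network f(x) = sum_i a[i] * relu(W[i] . x).
    # A large width stands in for the overparameterized regime (assumption).
    W = [[rng.gauss(0.0, 1.0) / math.sqrt(d_in) for _ in range(d_in)]
         for _ in range(width)]
    a = [rng.choice([-1.0, 1.0]) / math.sqrt(width) for _ in range(width)]
    return W, a

def predict(W, a, x):
    # Forward pass of the network on a single input vector x.
    return sum(a_i * max(0.0, sum(w * x_j for w, x_j in zip(row, x)))
               for row, a_i in zip(W, a))

def ogd_step(W, a, x, y, lr):
    # One online gradient descent step on the convex loss 0.5 * (f(x) - y)^2.
    # Only the hidden weights W are updated; the output weights a stay fixed.
    pre = [sum(w * x_j for w, x_j in zip(row, x)) for row in W]
    err = sum(a_i * max(0.0, p) for a_i, p in zip(a, pre)) - y
    new_W = [
        [w - lr * err * a_i * x_j for w, x_j in zip(row, x)] if p > 0.0 else row
        for row, a_i, p in zip(W, a, pre)
    ]
    # The returned loss is the one incurred before the update.
    return new_W, 0.5 * err ** 2

def run_online(T=100, d_in=4, width=128, lr=0.1, seed=0):
    rng = random.Random(seed)
    W, a = init_network(rng, d_in, width)
    target = [rng.gauss(0.0, 1.0) for _ in range(d_in)]  # synthetic data source
    losses = []
    for _ in range(T):
        x = [rng.gauss(0.0, 1.0) for _ in range(d_in)]
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]  # unit-norm input
        y = math.tanh(sum(t * v for t, v in zip(target, x)))
        W, loss = ogd_step(W, a, x, y, lr)  # suffer the loss, then update
        losses.append(loss)
    return losses
```

Calling `run_online()` returns the per-round losses. Regret in the abstract's sense would compare their sum with the cumulative loss of the best fixed network in hindsight, which this sketch does not compute.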


research
03/20/2021

Projection-free Distributed Online Learning with Strongly Convex Losses

To efficiently solve distributed online learning problems with complicat...
research
05/22/2023

Hierarchical Partitioning Forecaster

In this work we consider a new family of algorithms for sequential predi...
research
05/11/2013

On the Generalization Ability of Online Learning Algorithms for Pairwise Loss Functions

In this paper, we study the generalization properties of online learning...
research
05/26/2019

Nonparametric Online Learning Using Lipschitz Regularized Deep Neural Networks

Deep neural networks are considered to be state of the art models in man...
research
10/24/2014

Online and Stochastic Gradient Methods for Non-decomposable Loss Functions

Modern applications in sensitive domains such as biometrics and medicine...
research
12/31/2016

Lazily Adapted Constant Kinky Inference for Nonparametric Regression and Model-Reference Adaptive Control

Techniques known as Nonlinear Set Membership prediction, Lipschitz Inter...
research
08/30/2020

A Meta-Learning Control Algorithm with Provable Finite-Time Guarantees

In this work we provide provable regret guarantees for an online meta-le...
