
Information Theoretic Regret Bounds for Online Nonlinear Control

by Sham Kakade, et al.

This work studies the problem of sequential control in an unknown, nonlinear dynamical system, where we model the underlying system dynamics as an unknown function in a known Reproducing Kernel Hilbert Space. This framework yields a general setting that permits discrete and continuous control inputs as well as non-smooth, non-differentiable dynamics. Our main result, the Lower Confidence-based Continuous Control (LC^3) algorithm, enjoys a near-optimal O(√T) regret bound against the optimal controller in episodic settings, where T is the number of episodes. The bound has no explicit dependence on the dimension of the system dynamics, which could be infinite, but instead depends only on information-theoretic quantities. We empirically show its application to a number of nonlinear control tasks and demonstrate the benefit of exploration for learning model dynamics.
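To make the modeling assumption concrete, the following is a minimal sketch of fitting unknown dynamics with a kernel-style model. It is not the LC^3 algorithm, and nothing in it is taken from the paper's code. The toy system `true_dynamics`, the use of random Fourier features as a finite approximation to an RBF kernel, and the names `features`, `W_hat`, and `info_gain` are all assumptions made for illustration. The log-determinant at the end is one common information-gain-style quantity in kernel methods; the abstract does not state which quantity the paper's bound uses.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(x, u):
    # Hypothetical toy system, unknown to the learner (illustration only).
    return np.sin(2.0 * x) + 0.5 * u

def features(x, u, omega, bias):
    # Random Fourier features: a finite-dimensional approximation of an
    # RBF-kernel feature map on the joint input z = (x, u).
    z = np.concatenate([x, u], axis=-1)
    return np.sqrt(2.0 / omega.shape[0]) * np.cos(z @ omega.T + bias)

dx, du, D = 1, 1, 300                      # state dim, control dim, feature count
omega = rng.normal(size=(D, dx + du))      # unit-lengthscale RBF frequencies
bias = rng.uniform(0.0, 2.0 * np.pi, size=D)

# Collect noisy transitions (x, u) -> x_next from the toy system.
n = 500
X = rng.uniform(-1.0, 1.0, size=(n, dx))
U = rng.uniform(-1.0, 1.0, size=(n, du))
X_next = true_dynamics(X, U) + 0.01 * rng.normal(size=(n, dx))

# Ridge regression: W_hat minimizes sum ||W phi - x_next||^2 + lam ||W||_F^2.
lam = 1e-3
Phi = features(X, U, omega, bias)                  # shape (n, D)
V = Phi.T @ Phi + lam * np.eye(D)                  # regularized covariance
W_hat = np.linalg.solve(V, Phi.T @ X_next).T       # shape (dx, D)

# Evaluate one-step prediction error on fresh inputs.
X_test = rng.uniform(-1.0, 1.0, size=(200, dx))
U_test = rng.uniform(-1.0, 1.0, size=(200, du))
pred = features(X_test, U_test, omega, bias) @ W_hat.T
err = float(np.mean(np.abs(pred - true_dynamics(X_test, U_test))))

# Information-gain-style quantity: log det(I + Phi^T Phi / lam).
info_gain = float(np.linalg.slogdet(np.eye(D) + Phi.T @ Phi / lam)[1])
```

In this sketch the fitted model is linear in the features, so the number of features `D` can grow without changing the estimator's form. That is the sense in which a bound could avoid explicit dependence on the model dimension and depend instead on a quantity such as `info_gain`.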




An Information-Theoretic Analysis for Thompson Sampling with Many Actions

Information-theoretic Bayesian regret bounds of Russo and Van Roy captur...

Robust Online Control with Model Misspecification

We study online control of an unknown nonlinear dynamical system that is...

Random features for adaptive nonlinear control and prediction

A key assumption in the theory of adaptive control for nonlinear systems...

Empowerment for Continuous Agent-Environment Systems

This paper develops generalizations of empowerment to continuous states....

Information-Theoretic Confidence Bounds for Reinforcement Learning

We integrate information-theoretic concepts into the design and analysis...

Towards a Dimension-Free Understanding of Adaptive Linear Control

We study the problem of adaptive control of the linear quadratic regulat...

Fundamental Limitations of Control and Filtering in Continuous-Time Systems: An Information-Theoretic Analysis

While information-theoretic methods have been introduced to investigate ...

Code Repositories


Information Theoretic Regret Bounds for Online Nonlinear Control
