Reinforcement learning for linear-convex models with jumps via stability analysis of feedback controls

04/19/2021
by Xin Guo, et al.

We study finite-time-horizon, continuous-time linear-convex reinforcement learning problems in an episodic setting. In these problems, an unknown linear jump-diffusion process is controlled subject to nonsmooth convex costs. We show that the associated linear-convex control problems admit Lipschitz continuous optimal feedback controls and further prove the Lipschitz stability of the feedback controls, i.e., the performance gap between applying feedback controls for an incorrect model and for the true model depends Lipschitz-continuously on the magnitude of perturbations in the model coefficients; the proof relies on a stability analysis of the associated forward-backward stochastic differential equation. We then propose a novel least-squares algorithm which achieves a regret of the order O(√(N ln N)) on linear-convex learning problems with jumps, where N is the number of learning episodes; the analysis leverages the Lipschitz stability of feedback controls and concentration properties of sub-Weibull random variables.
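To give a feel for what a least-squares step on episodic trajectory data can look like, here is a minimal sketch. It is not the algorithm proposed in the paper. It assumes a scalar linear diffusion dX = (aX + bu) dt + σ dW with no jumps, an Euler-Maruyama discretisation, and a fixed feedback gain with added exploration noise so that the two coefficients are identifiable. All function names, parameter values and the exploration scheme are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_episode(a, b, k, x0, T, n_steps, sigma, rng):
    """Euler-Maruyama path of dX = (a X + b u) dt + sigma dW (no jumps; assumption)."""
    dt = T / n_steps
    x = np.empty(n_steps + 1)
    u = np.empty(n_steps)
    x[0] = x0
    for i in range(n_steps):
        # Feedback control plus exploration noise (assumption): without the noise,
        # u would be collinear with x and (a, b) could not be separated.
        u[i] = k * x[i] + rng.standard_normal()
        dw = np.sqrt(dt) * rng.standard_normal()
        x[i + 1] = x[i] + (a * x[i] + b * u[i]) * dt + sigma * dw
    return x, u, dt


def least_squares_estimate(episodes):
    """Regress state increments on [x * dt, u * dt], pooled over all episodes."""
    regressors, increments = [], []
    for x, u, dt in episodes:
        regressors.append(np.column_stack([x[:-1], u]) * dt)
        increments.append(np.diff(x))
    z = np.vstack(regressors)
    dx = np.concatenate(increments)
    theta, *_ = np.linalg.lstsq(z, dx, rcond=None)
    return theta  # array [a_hat, b_hat]


# Hypothetical true coefficients and episode settings, chosen only for the demo.
rng = np.random.default_rng(0)
true_a, true_b = -1.0, 0.5
episodes = [
    simulate_episode(true_a, true_b, k=-0.5, x0=1.0, T=1.0,
                     n_steps=100, sigma=0.5, rng=rng)
    for _ in range(200)
]
a_hat, b_hat = least_squares_estimate(episodes)
print(a_hat, b_hat)
```

In this toy setting the estimates move closer to the true coefficients as more episodes are pooled. The abstract's Lipschitz stability result is what links coefficient error of this kind to the performance gap of the resulting feedback control.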

research
01/09/2020

Regularity and stability of feedback relaxed controls

This paper proposes a relaxed control regularization with general explor...
research
03/22/2022

Linear convergence of a policy gradient method for finite horizon continuous time stochastic control problems

Despite its popularity in the reinforcement learning community, a provab...
research
10/27/2020

Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls

In this paper, we propose Q-learning algorithms for continuous-time dete...
research
11/02/2020

Reinforcement Learning of Structured Control for Linear Systems with Unknown State Matrix

This paper delves into designing stabilizing feedback control gains for ...
research
09/16/2021

Adaptive Control of Quadratic Costs in Linear Stochastic Differential Equations

We study a canonical problem in adaptive control; design and analysis of...
research
12/26/2019

Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem

Model-free reinforcement learning attempts to find an optimal control ac...
research
09/12/2021

Concave Utility Reinforcement Learning with Zero-Constraint Violations

We consider the problem of tabular infinite horizon concave utility rein...
