Differentiable Robust LQR Layers

06/10/2021
by Ngo Anh Vien, et al.

This paper proposes a differentiable robust LQR layer for reinforcement learning and imitation learning under model uncertainty and stochastic dynamics. The robust LQR layer can exploit the advantages of both robust optimal control and model-free learning, and it provides a new type of inductive bias for modeling stochasticity and uncertainty in control systems. In particular, we propose an efficient way to differentiate through a robust LQR optimization program by rewriting it as a convex program (i.e., a semi-definite program) of the worst-case cost. Building on recent work on using convex optimization inside neural network layers, we develop a fully differentiable layer for optimizing this worst-case cost; that is, we compute the derivative of a performance measure w.r.t. the model's unknown parameters, model uncertainty, and stochasticity parameters. We demonstrate the proposed method on imitation learning and approximate dynamic programming in stochastic and uncertain domains. The experimental results show that the proposed method can optimize robust policies under uncertainty and achieves significantly better performance than existing methods that do not model uncertainty directly.
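To make the idea of "differentiating through an LQR solution" concrete, here is a minimal sketch in plain Python. It is not the paper's method: it handles only a nominal (non-robust) scalar discrete-time LQR problem and uses implicit differentiation of the Riccati equation rather than a semi-definite program. The dynamics x' = a*x + b*u, the stage cost q*x^2 + r*u^2, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: nominal scalar LQR, not the robust SDP layer
# described in the abstract. All names and parameter values are hypothetical.

def solve_riccati(a, b, q, r, iters=500):
    """Solve the scalar discrete-time Riccati equation by fixed-point iteration.

    The equation P = q + a^2 * r * P / (r + b^2 * P) is the scalar form of
    P = q + a^2 P - (a b P)^2 / (r + b^2 P).
    """
    p = q
    for _ in range(iters):
        p = q + a * a * r * p / (r + b * b * p)
    return p


def riccati_residual(p, a, b, q, r):
    """F(P, a) = q + a^2 r P / (r + b^2 P) - P, which is zero at the solution."""
    return q + a * a * r * p / (r + b * b * p) - p


def dp_da(a, b, q, r):
    """Derivative of the cost-to-go coefficient P w.r.t. the model parameter a.

    Uses the implicit function theorem on F(P, a) = 0:
        dP/da = -(dF/da) / (dF/dP).
    """
    p = solve_riccati(a, b, q, r)
    a_cl = a * r / (r + b * b * p)          # closed-loop dynamics a - b*K
    dF_da = 2.0 * a * r * p / (r + b * b * p)
    dF_dp = a_cl * a_cl - 1.0               # negative when the loop is stable
    return -dF_da / dF_dp


def dp_da_finite_difference(a, b, q, r, eps=1e-6):
    """Central finite-difference check of dP/da."""
    hi = solve_riccati(a + eps, b, q, r)
    lo = solve_riccati(a - eps, b, q, r)
    return (hi - lo) / (2.0 * eps)


# Example with assumed values: an open-loop unstable system (a > 1).
a, b, q, r = 1.2, 1.0, 1.0, 1.0
p_star = solve_riccati(a, b, q, r)
print("P          :", p_star)
print("dP/da (IFT):", dp_da(a, b, q, r))
print("dP/da (FD) :", dp_da_finite_difference(a, b, q, r))
```

The analytic and finite-difference gradients should agree closely. A differentiable layer builds on the same principle: the solver's output is defined implicitly by optimality conditions, so gradients can be obtained from those conditions without unrolling the solver's iterations.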


research
06/04/2019

Robust exploration in linear quadratic reinforcement learning

This paper concerns the problem of learning control policies for an unkn...
research
08/21/2020

Adversarial Imitation Learning via Random Search

Developing agents that can perform challenging complex tasks is the goal...
research
06/29/2017

Path Integral Networks: End-to-End Differentiable Optimal Control

In this paper, we introduce Path Integral Networks (PI-Net), a recurrent...
research
03/28/2023

Worst-Case Control and Learning Using Partial Observations Over an Infinite Time-Horizon

Safety-critical cyber-physical systems require control strategies whose ...
research
06/24/2021

Shallow Representation is Deep: Learning Uncertainty-aware and Worst-case Random Feature Dynamics

Random features is a powerful universal function approximator that inher...
research
01/12/2023

Approximate Information States for Worst-Case Control and Learning in Uncertain Systems

In this paper, we investigate discrete-time decision-making problems in ...
research
02/26/2021

A Regret Minimization Approach to Iterative Learning Control

We consider the setting of iterative learning control, or model-based po...
