Combining Model-Based and Model-Free Methods for Nonlinear Control: A Provably Convergent Policy Gradient Approach

06/12/2020
by Guannan Qu, et al.

Model-free learning-based control methods have seen great success recently. However, such methods typically suffer from poor sample complexity and limited convergence guarantees. This is in sharp contrast to classical model-based control, which has a rich theory but typically requires strong modeling assumptions. In this paper, we combine the two approaches to achieve the best of both worlds. We consider a dynamical system with both linear and nonlinear components and develop a novel approach that uses the linear model to define a warm start for a model-free policy gradient method. Through both numerical experiments and theoretical analysis, we show that this hybrid approach outperforms the model-based controller while avoiding the convergence issues associated with model-free approaches. In the analysis, we derive sufficient conditions on the nonlinear component under which our approach is guaranteed to converge to the (nearly) globally optimal controller.
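To make the warm-start idea concrete, here is a minimal sketch on a toy scalar system. It is an illustration under stated assumptions, not the paper's algorithm or experiments: the system constants `A`, `B`, `Q`, `R`, the nonlinear term `f`, the initial states, and the helper names `lqr_gain`, `rollout_cost`, and `policy_gradient` are all hypothetical, and a deterministic two-point finite-difference estimate stands in for whatever gradient estimator the paper actually uses. The sketch computes a linear-quadratic gain from the linear part only, then refines that gain using nothing but rollout costs of the full nonlinear system.

```python
import math

# Hypothetical scalar system (assumption, not from the paper):
#   x_{t+1} = A*x_t + B*u_t + f(x_t), with stage cost Q*x^2 + R*u^2.
A, B = 1.2, 1.0
Q, R = 1.0, 1.0


def f(x):
    """Small nonlinear component that the linear model ignores."""
    return 0.1 * math.sin(x)


def lqr_gain(a, b, q, r, iters=500):
    """Model-based step: scalar Riccati iteration on the linear part only.

    Returns the gain k of the linear controller u = -k * x.
    """
    p = q
    for _ in range(iters):
        k = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
    return a * b * p / (r + b * b * p)


def rollout_cost(k, x0s=(1.0, -1.0, 2.0), horizon=50):
    """Average finite-horizon cost of u = -k * x on the nonlinear system."""
    total = 0.0
    for x0 in x0s:
        x = x0
        for _ in range(horizon):
            u = -k * x
            total += Q * x * x + R * u * u
            x = A * x + B * u + f(x)
    return total / len(x0s)


def policy_gradient(k0, step=0.01, delta=1e-3, iters=200):
    """Model-free step: gradient descent on the rollout cost.

    The gradient is estimated from two perturbed rollouts (central
    difference), so only observed costs are used, not the model.
    """
    k = k0
    for _ in range(iters):
        grad = (rollout_cost(k + delta) - rollout_cost(k - delta)) / (2 * delta)
        k -= step * grad
    return k


k_lqr = lqr_gain(A, B, Q, R)      # warm start from the linear model
k_pg = policy_gradient(k_lqr)     # refinement on the nonlinear system
print("warm-start gain:", k_lqr, "cost:", rollout_cost(k_lqr))
print("refined gain:   ", k_pg, "cost:", rollout_cost(k_pg))
```

In this toy setting the warm start already stabilizes the system, so the gradient steps begin from a sensible controller rather than an arbitrary one. The step size and perturbation width were chosen by hand for this example and would need tuning elsewhere.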

research
12/09/2018

The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint

The effectiveness of model-based versus model-free methods is a long-sta...
research
02/25/2021

Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √(T) Regret

We consider the task of learning to control a linear dynamical system un...
research
06/25/2019

Uncertainty-aware Model-based Policy Optimization

Model-based reinforcement learning has the potential to be more sample e...
research
10/13/2021

Stabilizing Dynamical Systems via Policy Gradient Methods

Stabilizing an unknown control system is one of the most fundamental pro...
research
10/09/2016

Visual Closed-Loop Control for Pouring Liquids

Pouring a specific amount of liquid is a challenging task. In this paper...
research
11/21/2017

Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces

Policy optimization methods have shown great promise in solving complex ...
research
02/18/2020

D2C 2.0: Decoupled Data-Based Approach for Learning to Control Stochastic Nonlinear Systems via Model-Free ILQR

In this paper, we propose a structured linear parameterization of a feed...
