The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint

12/09/2018
by Stephen Tu, et al.

The effectiveness of model-based versus model-free methods is a long-standing question in reinforcement learning (RL). Motivated by the recent empirical success of RL on continuous control tasks, we study the sample complexity of popular model-based and model-free algorithms on the Linear Quadratic Regulator (LQR). We show that for policy evaluation, a simple model-based plugin method requires asymptotically fewer samples than the classical least-squares temporal difference (LSTD) estimator to reach the same quality of solution; the sample complexity gap between the two methods can be at least a factor of the state dimension. For policy optimization, we study a simple family of problem instances and show that nominal (certainty equivalence principle) control also requires a factor of the state dimension fewer samples than the policy gradient method to reach the same level of control performance on these instances. Furthermore, the gap persists even when employing baselines commonly used in practice. To the best of our knowledge, this is the first theoretical result that demonstrates a separation in sample complexity between model-based and model-free methods on a continuous control task.
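
To make the comparison concrete, here is a minimal sketch, assuming the standard LQR setup (dynamics x_{t+1} = A x_t + B u_t + w_t, cost x^T Q x + u^T R u, linear policy u_t = K x_t, value V^K(x) = x^T P x), of the model-based side of both comparisons: least-squares identification of (A, B), a plugin Lyapunov solve for policy evaluation, and a certainty-equivalence Riccati solve for nominal control. The function and variable names (fit_dynamics, plugin_policy_evaluation, nominal_control) are illustrative assumptions, not the authors' code or exact estimators.

```python
# Sketch of the model-based baselines referred to in the abstract (illustrative only).
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov


def fit_dynamics(states, inputs, next_states):
    """Least-squares estimate of (A, B) from observed transitions."""
    Z = np.hstack([states, inputs])                  # regressors z_t = [x_t; u_t]
    Theta, *_ = np.linalg.lstsq(Z, next_states, rcond=None)
    n = states.shape[1]
    return Theta[:n].T, Theta[n:].T                  # A_hat (n x n), B_hat (n x d)


def plugin_policy_evaluation(A_hat, B_hat, K, Q, R):
    """Model-based 'plugin' policy evaluation: solve the closed-loop Lyapunov
    equation P = L^T P L + Q + K^T R K with the estimated L = A_hat + B_hat K."""
    L_hat = A_hat + B_hat @ K
    return solve_discrete_lyapunov(L_hat.T, Q + K.T @ R @ K)


def nominal_control(A_hat, B_hat, Q, R):
    """Certainty-equivalence control: treat (A_hat, B_hat) as the truth and
    return the LQR gain from the discrete algebraic Riccati equation."""
    P = solve_discrete_are(A_hat, B_hat, Q, R)
    return -np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
```

The model-free counterparts compared against in the paper, LSTD for evaluation and policy gradient for optimization, estimate the value matrix or the control gain directly from observed costs without ever fitting (A, B); the abstract's claim is that skipping the model costs at least a factor of the state dimension in samples on the instances studied.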

Related research

02/25/2018 - Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Model-free reinforcement learning (RL) is a powerful, general tool for l...

06/12/2020 - Combining Model-Based and Model-Free Methods for Nonlinear Control: A Provably Convergent Policy Gradient Approach
Model-free learning-based control methods have seen great success recent...

05/30/2019 - Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator
We study the sample complexity of approximate policy iteration (PI) for ...

04/26/2023 - Reinforcement Learning with Partial Parametric Model Knowledge
We adapt reinforcement learning (RL) methods for continuous control to b...

12/22/2017 - Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Reinforcement learning (RL) has been successfully used to solve many con...

04/19/2023 - Sample-efficient Model-based Reinforcement Learning for Quantum Control
We propose a model-based reinforcement learning (RL) approach for noisy ...

02/21/2017 - Sample Efficient Policy Search for Optimal Stopping Domains
Optimal stopping problems consider the question of deciding when to stop...
