The Value of Planning for Infinite-Horizon Model Predictive Control

by Nathan Hatch, et al.

Model Predictive Control (MPC) is a classic tool for optimal control of complex, real-world systems. Although it has been successfully applied to a wide range of challenging tasks in robotics, it is fundamentally limited by the prediction horizon, which, if too short, results in myopic decisions. Recently, several papers have suggested using a learned value function as the terminal cost for MPC. If the value function is accurate, it effectively allows MPC to reason over an infinite horizon. Unfortunately, Reinforcement Learning (RL) solutions to value function approximation can be difficult to realize for robotics tasks. In this paper, we suggest a more efficient method for value function approximation that applies to goal-directed problems, such as reaching and navigation. In these problems, MPC is often formulated to track a path or trajectory returned by a planner. However, this strategy is brittle in that unexpected perturbations to the robot will require replanning, which can be costly at runtime. Instead, we show how the intermediate data structures used by modern planners can be interpreted as an approximate value function. We show that this value function can be used by MPC directly, resulting in more efficient and resilient behavior at runtime.
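The core idea can be sketched in a toy setting. The following is a minimal, illustrative Python example (not the paper's implementation): a Dijkstra search from the goal produces a cost-to-go table, which is exactly the kind of intermediate planner data structure the abstract refers to, and a short-horizon MPC loop then uses that table as its terminal cost instead of tracking a fixed planned path. All names (`GRID`, `cost_to_go`, `mpc_step`) and the grid-world dynamics are assumptions made for the sketch.

```python
import heapq
from itertools import product

# Hypothetical 5x5 grid world: 0 = free, 1 = obstacle (illustrative only).
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def neighbors(cell):
    r, c = cell
    for dr, dc in ACTIONS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == 0:
            yield (nr, nc)

def cost_to_go(goal):
    """Dijkstra from the goal. The resulting table is the planner's
    intermediate data structure, reused as an approximate value V(s)."""
    V = {goal: 0.0}
    pq = [(0.0, goal)]
    while pq:
        d, s = heapq.heappop(pq)
        if d > V.get(s, float("inf")):
            continue  # stale queue entry
        for n in neighbors(s):
            nd = d + 1.0  # unit step cost
            if nd < V.get(n, float("inf")):
                V[n] = nd
                heapq.heappush(pq, (nd, n))
    return V

def mpc_step(state, V, horizon=3):
    """Short-horizon MPC: exhaustively score action sequences by running
    cost plus terminal cost-to-go V, which stands in for the infinite-
    horizon tail. Returns the first action of the best sequence."""
    best, best_cost = None, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        s, cost, feasible = state, 0.0, True
        for a in seq:
            n = (s[0] + a[0], s[1] + a[1])
            if n not in V:  # off-grid, obstacle, or unreachable
                feasible = False
                break
            s, cost = n, cost + 1.0
            if V[s] == 0.0:  # reached the goal early
                break
        if feasible and cost + V[s] < best_cost:
            best, best_cost = seq[0], cost + V[s]
    return best

goal = (4, 4)
V = cost_to_go(goal)
state = (0, 0)
for _ in range(30):
    if state == goal:
        break
    a = mpc_step(state, V)
    state = (state[0] + a[0], state[1] + a[1])
print(state)  # → (4, 4)
```

Because the terminal cost covers the whole free space rather than a single planned path, a perturbation that knocks the robot off course needs no replanning: the next `mpc_step` call simply optimizes from wherever the robot actually is.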




